UNIVERSITY OF ILLINOIS BULLETIN
ISSUED WEEKLY
Vol. XXI November 26, 1923 No. 13
[Entered as second-class matter December 11, 1912, at the post office at Urbana, Illinois, under the
Act of August 24, 1912. Acceptance for mailing at the special rate of postage provided for in
section 1103, Act of October 3, 1917, authorized July 31, 1918.]


BULLETIN No. 17 


BUREAU OF EDUCATIONAL RESEARCH 
COLLEGE OF EDUCATION 


THE PRESENT 

STATUS OF WRITTEN EXAMINATIONS 
AND SUGGESTIONS FOR THEIR 

IMPROVEMENT 




By 


Walter S. Monroe
Director, Bureau of Educational Research 


Assisted by 


Lloyd B. Souders
Formerly Assistant in Bureau of Educational Research 


PRICE 50 CENTS 


PUBLISHED BY THE UNIVERSITY OF ILLINOIS, URBANA 
1923 


The Bureau of Educational Research was established by act 
of the Board of Trustees June 1, 1918. It is the purpose of the 
Bureau to conduct original investigations in the field of education, 
to summarize and bring to the attention of school people the results 
of research elsewhere, and to be of service to the schools of the
state in other ways.


The results of original investigations carried on by the Bureau 
of Educational Research are published in the form of bulletins. A 
complete list of these publications is given on the back cover of 
this bulletin. At the present time five or six original investigations 
are reported each year. The accounts of research conducted else- 
where and other communications to the school men of the state 
are published in the form of educational research circulars. From 
ten to fifteen of these are issued each year. 


The Bureau is a department of the College of Education. Its 
immediate direction is vested in a Director, who is also an instructor 
in the College of Education. Under his supervision research is 
carried on by other members of the Bureau staff and also by grad- 
uates who are working on theses. From this point of view the 
Bureau of Educational Research is a research laboratory for the 
College of Education. 


Bureau of Educational Research
College of Education 


University of Illinois, Urbana 



TABLE OF CONTENTS


Written examinations tend to encourage undesirable mental processes
Passing the final examination an undesirable objective
Examinations injurious to health of students
Time devoted by teachers to written examinations not profitably spent


CHAPTER III. PREPARATION AND ADMINISTRATION OF EXAMINATIONS IN HIGH SCHOOLS

The data collected
Requirement of final examinations in Illinois high schools
Time devoted to written examinations
Characteristics noted in marking examination papers
[illegible] questions
Recognition of rate of work
Methods of marking examination papers
Directions to students concerning methods of work
Recognition of a standard distribution in assigning grades to examination papers
Relation of examination grades to final grades
Summary


CHAPTER IV. THE CONSTANT AND VARIABLE ERRORS IN EXAMINATION GRADES

Constant and variable errors of measurement
Magnitude of variable errors of measurement in standardized test scores and in examination grades
Methods employed in present investigation concerning reliability of written examinations
Data collected for investigation
Reliability of written examination grades and of standardized test scores
Conditions tending to produce variable errors of measurement in examination grades
Magnitude of constant errors of measurements in standardized test scores and in examination grades
Results of present investigation and of previous studies compared
Conclusion: relative accuracy of examination grades and of test scores


CHAPTER V. THE CONTENT OF WRITTEN EXAMINATIONS

The data collected
Classification of questions
Relation of examination questions to educational objectives


CHAPTER VI. THE IMPROVEMENT OF WRITTEN EXAMINATIONS

Reduction of constant errors
Reduction of variable errors
Agreement of content of examinations with educational objectives
Simplification of administration of written examinations


CHAPTER VII. RULES FOR THE PREPARATION AND ADMINISTRATION OF WRITTEN EXAMINATIONS


APPENDIX



PREFACE 


This bulletin reports the results of three extensive 
investigations relating to written examinations. These 
investigations were made by Mr. Souders under the di- 
rection of the Director of the Bureau of Educational Re- 
search. The tabulations and statistical calculations were 
made by Mr. Souders or by clerks working under his im- 
mediate direction. The preparation of the published re- 
port, however, is the work of the Director of the Bureau. 

The Bureau of Educational Research wishes to ac- 
knowledge its indebtedness to the superintendents, prin- 
cipals, and teachers who cooperated by furnishing the 
necessary data. The data required in the study of relia- 
bility of written examinations necessitated considerable 
additional labor. Without their cooperation these in- 
vestigations would not have been possible. 


Walter S. Monroe, Director.
November 1, 1923. 


THE PRESENT STATUS OF WRITTEN EXAMINATIONS
AND SUGGESTIONS FOR THEIR 
IMPROVEMENT 


CHAPTER I 
INTRODUCTION 


Preparation and administration of written examinations im- 
portant phases of the teacher’s work. Written examinations, 
except in the few schools where they have been abolished, form a 
very important phase of the teacher’s work, both because of the 
time devoted to their preparation and administration and of the 
significance attached to the measures which they yield. The final 
grades upon which promotion and the awarding of school honors 
depend are determined largely by final examinations and by writ- 
ten tests given during the school term. Altho standardized edu- 
cational tests have become widely used during recent years, 
written examinations are still the most frequently used type of 
measuring instrument. This will probably always be true, par- 
ticularly in the high school. Hence, we may expect that written 
examinations will occupy in the future as in the past, an important 
place in the work of our schools. 

Need for more information concerning written examinations. 
There have been numerous investigations which showed that 
the marking of written examination papers is highly subjective— 
that is, different teachers tend to assign different marks to the 
same paper. With the exception of these studies relatively little 
precise information is available in regard to written examinations 
but a number of criticisms based upon experience and theoretical 
considerations have been made. As a result many teachers and 
other school officials have come to consider written examinations 
very inferior instruments and have abolished them in a number of 
schools. 

A search through our educational literature, particularly 
textbooks, reveals an astonishing lack of information in regard to 
written examinations. Relatively little specific attention has been 
given to their preparation and administration in our courses for
the training of teachers. Inexperienced teachers have been left
largely to their own devices in this important phase of their work.
Careful inquiry and observation have indicated that there is a variety
of practises with reference to the types of questions asked and 
to the administration of written examinations. Hence it appears 
that there is need for a comprehensive investigation of the present 
status of written examinations in order that a more intelligent esti- 
mate may be formed of their value in the process of education and 
that specific directions may be formulated in regard to their 
preparation and administration. 

Purpose of this bulletin. It is the purpose of this bulletin to 
present (1) a brief summary of certain previous investigations re- 
lating to written examinations and also of the arguments for and 
against written examinations; (2) the results of three extensive in- 
vestigations conducted by the Bureau of Educational Research, 
(a) the preparation and administration of written examinations in 
Illinois high schools, (b) the constant and variable errors in exami- 
nation grades, and (c) the content of written examinations; and 
(3) suggestions for the improvement of written examinations. In 
the concluding chapter the author presents a list of rules in regard 
to the preparation and administration of written examinations.


CHAPTER II


SUMMARY OF CRITICISMS OF WRITTEN EXAMINATIONS¹


Plan of chapter. In this chapter the important criticisms of 
written examinations are briefly summarized. Following each 
criticism either a brief answer is given or a reference is made to a 
detailed discussion in a later chapter. By presenting both sides 
of the question in this way, it is hoped that the reader will be 
assisted in forming an intelligent estimate of the merits of written 
examinations. 

I. Examinations yield inaccurate measures of school achieve- 
ment. In support of this argument six points have been made. 

1. The most important criticism relating to the accuracy of 
written examinations is that the marking of the papers is highly 
subjective. A large number of scientific investigations have 
yielded objective evidence that different teachers when working 
independently tend to assign widely varying marks to the same 
paper. One of the first studies of this type was by Starch and 
Elliott who found that the marks assigned to the same examina- 
tion paper in Plane Geometry by 116 teachers ranged from 28 to 
92 on the scale of 100 percent. The facts of such investigations as
this can not be disputed but as we have no means of securing per-
fectly accurate measures of achievement, the question at issue
concerns the relative rather than the absolute accuracy of the
measurements secured. Facts may be misinterpreted. In Chapter
IV we shall present evidence to show that when judged in relation
to other means for measuring school achievement, written exami-
nations yield relatively more accurate measures than generally
supposed. In view of the additional information secured the sub-
jectivity of written examinations loses much of its potency as a
reason for their abolition.


¹Starch, Daniel, and Elliott, E. C. “Reliability of grading high-school work in
mathematics,” School Review, 21:254-59, April, 1913.

Morton, Robert L. “The examination method of licensing teachers,” Educational
Administration and Supervision, 6:421, November, 1920.

Wood, Ben D. “Measurement of college work,” Educational Administration and
Supervision, 7:301-34, September, 1921.

Kelly, F. J. “Teachers’ marking,” Teachers College Contributions to Education,
No. 66. New York: Teachers College, Columbia University, 1914.

2. The questions of ordinary examinations are usually not 
equal in difficulty and weighting by teachers is highly subjective.²
It has been inferred that this condition tends to increase materially
the inaccuracy of examination marks. Comparisons of
weighted and non-weighted scores yielded by standardized tests
have revealed that the errors introduced by disregarding the unequal
difficulty of exercises or questions are not significant in most cases.³
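
The comparison of weighted and non-weighted scores can be illustrated with a small sketch. The pupils, question weights, and marks below are hypothetical and are not taken from the studies cited; the sketch only shows how both kinds of score may be computed for the same papers and how the resulting orders of merit may be compared.

```python
# Hypothetical illustration: none of these figures come from the studies cited.
# Each question carries a difficulty weight assigned by the examiner.
weights = [1, 1, 2, 3, 3]

# 1 = answered correctly, 0 = answered incorrectly; one list per pupil.
papers = {
    "Pupil A": [1, 1, 1, 1, 0],
    "Pupil B": [1, 1, 0, 0, 0],
    "Pupil C": [1, 0, 1, 1, 0],
}


def unweighted_score(marks):
    """Percent of questions answered correctly, every question counting alike."""
    return 100 * sum(marks) / len(marks)


def weighted_score(marks, weights):
    """Percent of the total weight earned, harder questions counting for more."""
    earned = sum(m * w for m, w in zip(marks, weights))
    return 100 * earned / sum(weights)


def ranking(scores):
    """Pupil names ordered from highest score to lowest."""
    return sorted(scores, key=scores.get, reverse=True)


unweighted = {name: unweighted_score(m) for name, m in papers.items()}
weighted = {name: weighted_score(m, weights) for name, m in papers.items()}

for name in papers:
    print(f"{name}: unweighted {unweighted[name]:.0f}, weighted {weighted[name]:.0f}")
print("Same order of merit:", ranking(unweighted) == ranking(weighted))
```

In this invented case the individual scores shift when weights are applied, but the order of the pupils does not. Whether that holds for a real class depends on the actual weights and papers.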

3. It has been pointed out that frequently the content of 
written examinations is not in agreement with recognized educa- 
tional objectives. Catch questions relating to trivial facts or 
worded in a misleading way have been cited as illustrations. 
Certain examination questions also have referred to items which 
had not been included in the course or at least had received only 
minor emphasis. Some evidence with reference to the justifica- 
tion of this criticism will be presented in Chapter V. 

4. In most examinations the rate of work is neglected. The 
usual practise is to allow sufficient time for all pupils to finish or 
to base the mark only on the questions answered in the unfinished 
papers. Hence a student’s examination grade is not influenced by 
the rate at which he answers the questions. It is easily possible to 
take into account the student’s rate of work in determining the 
mark assigned to his examination paper. One plan is to set an ex- 
amination of sufficient length so that all members of the class will 
be employed during the entire period. Another procedure is to 
have the student record the time when he finishes. In this way 
some weight can be given to his rate of work. This criticism is, 
however, a minor one. In some subjects the rate of work is an im- 
portant consideration but in others, particularly those in which 
reasoning predominates in answering the questions, the neglect of 
the rate of work will affect the accuracy of the examination marks 
only slightly, if at all. 


²Comin, Robert. “Teachers’ estimates of the abilities of pupils,” School and
Society, 3:67-70, January 8, 1916. 


³Charters, W. W. “Constructing a language and grammar scale,” Journal of Edu-
cational Research, 1:249-58, April, 1920. 


Monroe, Walter S. “The description of the performances of pupils on exercises 
of varying difficulty,” School and Society, 15:341-43, March, 1922. 

5. Written examinations are usually so short that they do not
offer an adequate opportunity for a student to demonstrate his 
ability. This criticism is frequently expressed in the statement 
that it is unjust to base a student’s standing for a semester or a 
year on an examination paper written during a brief examination 
period. When stated in this way the criticism refers to two issues 
between which there is failure to distinguish. The first is in regard 
to the weight allowed the examination grades in determining a 
student’s final standing. This question of the weight given the 
final examination is discussed in a later chapter, but it may be 
said here that the usual practise in high schools is to count the 
written examination as one-third of the student’s total grade. The 
second refers to the inaccuracy of the grade due to the limited 
opportunity which is given the student to demonstrate his ability. 
For practical reasons it is necessary that measurement of the total 
achievement for the term be based upon a sample. In general, in- 
creasing the scope of the examination will tend to increase the 
accuracy of the measures yielded. Some evidence with reference to 
the reliability of examination grades based upon short samples 
will be presented in Chapter IV. It is possible for a teacher to 
make examinations more comprehensive. This can be accomplish- 
ed in part by exercising more care in the preparation of the ques- 
tions. The “new examination” in which pupils are required to do 
little or no writing affords one means for covering a wide range of 
subject-matter in a brief period. This method of improving ex- 
aminations will be discussed in Chapter VI. 
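
The statement that a more comprehensive examination tends to yield more accurate measures can be given a rough numerical form. The sketch below uses the Spearman-Brown formula, which is not mentioned in this bulletin and is introduced here only as an illustration. It assumes that the added questions are comparable to the original ones, and the starting reliability of 0.60 is hypothetical.

```python
def lengthened_reliability(reliability, length_factor):
    """Spearman-Brown estimate of reliability after lengthening a test.

    reliability   -- reliability coefficient of the original examination
    length_factor -- how many times longer the new examination is
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical starting value, chosen only for illustration.
original = 0.60
for factor in (1, 2, 3):
    estimate = lengthened_reliability(original, factor)
    print(f"{factor} x length: estimated reliability {estimate:.2f}")
```

Under these assumptions the estimated reliability rises as the examination is lengthened, but each added portion contributes less than the one before it.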

The final point to be made with reference to the inaccuracy 
of examination marks refers to the distinction between a “score” 
which describes a pupil’s performance on the examination and 
a “grade” which interprets this score with reference to a norm. 
Failure to recognize this distinction is primarily responsible for 
too high grading by some teachers and too low by others. Even 
the same teacher is likely to assign “high grades” on some ex- 
aminations and “low grades” on others. 

In order to understand how norms (standards) are used in 
connection with the grading of examination papers it is necessary 
to distinguish between “scores,” or measures, and “grades,” or
marks. A “score” simply describes the performance which has 
been recorded on the examination paper. For example, a pupil 
may answer 55 percent of the questions correctly. In this case 55 
is his “score.” If a certain number of points or credits had been
given for each question his score might be 129, or 91, or 217. A
“grade” interprets this description with reference to certain
norms. A “grade” indirectly describes a pupil’s performance on an 
examination, but it tells also whether the performance is to be 
considered as above or below passing; whether the pupil is to re- 
ceive the highest mark or the lowest mark or an average mark. 

It is customary to describe the quality of examination 
papers in terms of the percent of questions answered correctly. 
For example, if an examination includes ten questions and a 
pupil answers seven of them correctly and an eighth one partially 
right, he is given a score of 75 percent, which is interpreted to mean
that in the judgment of the examiner he has answered the ques- 
tions 75 percent correctly. School marks or “grades” are also 
frequently expressed in terms of percents. Sometimes they are 
expressed in terms of letters or other symbols, but these in turn 
are defined in terms of percents. For example, the grade of “A” 
may be defined as being between 95 percent and 100 percent. 
Since both “scores” and “grades” are generally expressed in 
terms of percents, it is only natural that the two have been confused
and that “scores” have been used as “grades.”

A good illustration of their difference came to the writer 
recently. An examination in mathematics was given to nearly 
1000 freshmen in one of our large universities. This examination 
may properly be described as “hard,” considering the training 
which the students had received. One student made a score of 
100. The lowest score was 12. The average was approximately 
55. From the standpoint of the distribution of scores this was a 
“good examination.” If it had been easier, so that any considerable
number of pupils received scores of 100 percent, it would have
been unsatisfactory. If it had been so “hard” that a considerable
number of students made zero scores it would also have been defective.
In both cases it would have failed to differentiate between
some students who were not equal in ability. However, obviously an 
injustice would be done if a passing mark of 70 or 75 were adopted 
and all pupils having scores below this mark were given a grade 
of failure. The passing mark for this particular examination should 
be in the neighborhood of 40. If the “scores” are to be represented
in terms of “grades” a “score” of 40 should be translated into a
“grade” of 70 or whatever passing mark has been adopted by the
institution.
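
The translation of “scores” into “grades” in this illustration can be sketched as follows. The text fixes only one point of the translation: a passing score of about 40 corresponds to the passing grade, here taken as 70. The straight-line interpolation between that point and the two extremes is an assumption made for illustration, not a procedure stated in this bulletin.

```python
def score_to_grade(score, passing_score=40, passing_grade=70, max_score=100):
    """Translate a raw examination score into a grade.

    Assumption (not from the bulletin): scores are mapped piecewise linearly,
    with 0 -> 0, passing_score -> passing_grade, and max_score -> 100.
    """
    if not 0 <= score <= max_score:
        raise ValueError("score outside the range of the examination")
    if score <= passing_score:
        return passing_grade * score / passing_score
    above = (score - passing_score) / (max_score - passing_score)
    return passing_grade + (100 - passing_grade) * above


# Scores mentioned in the illustration: lowest, passing, average, highest.
for score in (12, 40, 55, 100):
    print(f"score {score:3d} -> grade {score_to_grade(score):.1f}")
```

On this scheme the average score of about 55 falls above the passing grade, whereas treating the score itself as the grade would have failed the average student.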
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, The recognition of this distinction between “‘scores” and 

grades” enables us to indicate the way in which subjective norms 
are implied in “grades.” A “grade” is not a pure measure or 
description of the pupil’s performance. It is rather an interpre- 
tation of the measure of his performance with reference to certain 
norms. When no distinction is made and “scores” are used as 
“grades,” pupils will receive high “grades” if the examination is
“easy,” and low ones if it is “hard.” Thus, the difficulty of the
examination is one factor in establishing the norms with reference
to which the “scores” are interpreted when they are used as
“grades.” Severe marking will tend to set high norms. Only when
the examination is of average or “standard” difficulty and the 
marking is average in severity do “scores” and “grades” become 
identical in magnitude. Since the norms are established by the 
difficulty of the examination and the severity of the scoring, they 
must be subjective. In the investigations of the marking of ex- 
amination papers it was shown that teachers varied widely in 
their judgments concerning the worth of examination papers. 
There is no reason to expect that they would agree more closely 
in estimating the difficulty of examinations. Hence, norms which 
depend upon teachers’ estimates of the questions appropriate for 
examinations and upon their marking of the papers must be con- 
sidered subjective. It is possible to increase greatly the objectivity 
of these norms and the first requirement is to recognize the distinction
between “scores” and “grades.” (See page 38 for a
further consideration of this topic.) 

Summary of inaccuracy of examination marks. From the 
preceding discussion examination marks are, without doubt, 
shown to be far from accurate measures of school achievement. 
However, it does not necessarily follow that the errors involved 
are of sufficient magnitude to justify the abolishment of written 
examinations. In the writer’s belief the greatest benefit will come 
from making an intelligent inquiry into the nature of these errors 
and from taking steps to reduce them to the lowest magnitude. 

II. Written examinations tend to encourage undesirable 
mental processes. Many critics have claimed that most exami- 
nations, particularly those given at the end of a course, tend to 
encourage “cramming.” The assertion is made that many stud- 
ents do little or no studying until near the close of the term. Then 
by the process of “cramming” they are able to pass the final examination
and attain a relatively high standing in the course.
This criticism assumes that “cramming” is an undesirable mental 
process and that final examinations are responsible for its occur- 
rence. The undesirable feature is the neglect of study throughout 
the term. This is not due to the fact that final examinations are 
given but that undue emphasis is placed upon them and that the 
teacher has failed to check up on the student’s work day by day 
throughout the term. 

One of the points which may be made in favor of final ex- 
aminations is that they furnish an immediate incentive for review 
and organization of the content of the course. The writing of an 
examination itself may be an important part of the student’s 
learning. This is particularly true in the case of questions which 
require reasoning and organization of information. “There is no
impression without expression,” and the writing of a three-hour 
examination is undoubtedly an intensive form of expression. 
Hence, one is justified in maintaining that written examinations 
tend more to encourage desirable mental processes than undesir- 
able ones. 


III. Passing the final examination an undesirable objective. 
The assertion has been made that when a final examination is 
required, the passing of it tends to become the objective for which 
many students work. When this occurs it is due not to the fact 
that the final examination is required but rather to the undue 
emphasis which is placed upon it by the school. If an examination 
consists of appropriate questions it is not undesirable to have the 
student keep it in mind as one of the objectives to be attained by 
studying the subject-matter of the course. However, as we shall 
show later, (see page 25) the usual practise is to count the final ex- 
amination grade as one-third in determining a student’s final 
standing. In many schools it receives less weight. When the final 
examination counts only one-third or less in determining a stud- 
ent’s final standing it is difficult to say in what respect it forms an 
important educational objective. 
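
The weight of the final examination can be made concrete with a short sketch. It assumes that the remaining two-thirds of the final standing comes from a single average of daily work, which is a simplification, and the grades used are hypothetical.

```python
def final_standing(daily_average, examination_grade, examination_weight=1 / 3):
    """Combine daily work and the final examination into one final grade."""
    daily_weight = 1 - examination_weight
    return daily_weight * daily_average + examination_weight * examination_grade


# Hypothetical student with a daily average of 80.
daily = 80
for exam in (50, 80, 95):
    print(f"examination {exam}: final standing {final_standing(daily, exam):.1f}")
```

With a weight of one-third, a change of three points in the examination grade moves the final standing by one point.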

IV. Examinations injurious to health of students. Some 
critics claim that written examinations, particularly those given 
at the end of a course, are injurious to the health of students, 
many of whom make very strenuous preparation for them. The 
obvious strain which accompanies the writing of answers to the 
questions of examinations sometimes lasting two or three hours 
must also be borne. It is undoubtedly true that both the preparation
and the writing frequently make enormous drains on the
energies of students. However, no careful investigation has been 
conducted of the actual effect upon their health. To one who ob- 
serves the great expenditures of time and energy devoted to social 
and athletic activities, it is difficult to believe that examinations 
are in general more injurious to the health of students than many 
other activities in which they are permitted and even encouraged 
to engage. Here again it should be realized that this criticism is
not fundamentally a criticism of examinations, but rather of 
setting very long examinations or of placing extreme emphasis 
upon them by making the final grade of the course depend wholly 
or very largely upon the examination grade. 


V. Time devoted by teachers to written examinations not 
profitably spent. In the opinion of some critics the time given to 
the preparation of questions and particularly to the marking of 
examination papers might be more profitably employed. Infor- 
mation concerning the time actually devoted to the preparation 
and the administration of written examinations is given in Chapter
III. However, it may be pointed out here that a teacher can not 
attain a high degree of efficiency as an instructor unless he checks 
up the work of his students in order to assist those who need 
supplementary and remedial instruction. Only by knowing the 
extent to which his students have achieved individually and col- 
lectively can a teacher make his instruction fit the needs of his 
class. Thus considerable time must be given to measuring the 
results of teaching. This is an indispensable portion of the teach- 
er’s task. It is only when a teacher devotes an undue proportion 
of his time to the preparation and administration of examinations 
that such work tends to be wasted. Doubtless, the time devoted to 
written examinations might in many cases be profitably increased. 
Students receiving low marks should have their answers studied in 
order to ascertain in what ways and why they have failed. Such 
information will frequently be exceedingly illuminating to the 
instructor, and aid him in determining his own shortcomings. 


CHAPTER III


PREPARATION AND ADMINISTRATION OF 
EXAMINATIONS IN HIGH SCHOOLS 


The data collected. The purpose of the study reported in this 
chapter was to secure information concerning the present practise 
in the preparation and administration of written examinations in 
high schools. A questionnaire was mailed in the fall of 1922 to 254 
high-school principals in Illinois and a second one was sent to 
approximately 2900 high-school teachers.¹ One hundred and
eighty-nine replies were received from principals and 1816 from 
teachers. Of the latter it was necessary to discard eighty so that 
the following report is based upon returns from only 1736 high- 
school teachers who are distributed as follows: 


Commercial Subjects   192     Modern Languages    82
Drawing and Art        26     Music               21
English               342     Science            309
Home Economics        143     Shop Work           58
Latin                 118     Social Science     198
Mathematics           247
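
The distribution can be checked against the totals stated in the text. The sketch below simply re-adds the figures given in the table; the dictionary keys are the subject labels as read from the table, and the variable names are chosen here for illustration.

```python
# Number of teachers whose replies were used, by subject, as given in the table.
teachers_by_subject = {
    "Commercial Subjects": 192,
    "Drawing and Art": 26,
    "English": 342,
    "Home Economics": 143,
    "Latin": 118,
    "Mathematics": 247,
    "Modern Languages": 82,
    "Music": 21,
    "Science": 309,
    "Shop Work": 58,
    "Social Science": 198,
}

replies_received = 1816
replies_discarded = 80

total_used = sum(teachers_by_subject.values())
print("Teachers in table:", total_used)
print("Received minus discarded:", replies_received - replies_discarded)
```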


Representative character of data collected. The high schools 
from which answers to the questionnaire were received ranged 
from those established in rural communities to a large metro- 
politan high school. No supplementary investigation was made to 
ascertain the extent to which the replies were representative of 
conditions in Illinois but in the tabulations there was no indication
that the data collected were not representative of the state.
A few of the replies, particularly those of teachers, suggest that 
some slight misinterpretation of certain of the questions may have 
been made. (See page 23). Such cases, however, were relatively 
rare and probably did not affect the median of the results. 

¹These questionnaires are reproduced in the appendix on pages 66 and 68.


Extent of the requirement of final examinations in Illinois
high schools. Evidence of the subjectivity of the marking of ex-
amination papers, together with other adverse criticisms of written
examinations, has tended to cause many teachers and superin-
tendents to be skeptical of their value. In a number of schools
final examinations have been abolished or made optional with the
teachers and they are not considered essential by many teachers.
In order to ascertain the present practise in Illinois the high-school
principals were asked, “Do you require your teachers to give final
examinations?” Only twenty-one principals or 11 percent stated
that final examinations were not required. Thus it is the practise
in Illinois high schools to require that final examinations be given.
This, however, does not mean that all students must take them.
Of the 168 high schools in which final examinations are required
101 or 60 percent reported that it was their practise to exempt
certain students. Scholarship, that is making a grade on daily
work above a certain average, was mentioned by all of these
schools as one of the conditions on which exemption was based.
Deportment was mentioned by 52 percent and attendance by 32
percent as additional conditions.
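
The percentages reported in this paragraph follow directly from the counts given. A short check, using only the figures stated in the text (the variable and function names are chosen here for illustration):

```python
principals_replying = 189
not_requiring = 21

requiring = principals_replying - not_requiring
exempting = 101


def percent(part, whole):
    """Share of `whole` represented by `part`, rounded to a whole percent."""
    return round(100 * part / whole)


print("Schools requiring final examinations:", requiring)
print("Percent not requiring:", percent(not_requiring, principals_replying))
print("Percent of requiring schools that exempt:", percent(exempting, requiring))
```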

No information was secured with reference to the explanation 
of the exemption from examinations of students meeting certain 
conditions but general observation has indicated that two reasons 
are frequently recognized. The first is that promise of exemption 
from the final examinations operates as a powerful motive to 
secure a high quality of daily work, regular attendance, and good 
deportment. The other is the belief held by many teachers that 
final examinations are unnecessary to determine a student’s stand- 
ing in a course. They contend that the average of a student’s 
daily grades should be taken as a final grade for the course. 

There is no doubt that the promise of exemption from the 
final examination operates as a powerful motive in the case of 
many students. It should, however, be recognized that such an 
incentive is artificial and therefore open to criticism. In so far as 
possible a student should be actuated by motives which sustain 
an intrinsic relation to the subject-matter. If it is necessary or 
advisable that the final examination be considered as a motive, 
it could be used to encourage systematic review and organization 
of the course. This should constitute a very important phase of 
studying. Students may, of course, be asked by their teachers to 
review frequently and to summarize and organize the work at the 
end of the term, but they cannot be convinced easily of the 
necessity of such work if it receives no more weight in determining 
their final grade than their performances during an equal period 
of time elsewhere in the course. 

The second reason is a valid one in many cases. In the
experience of most teachers the mark made on the final examination
changes the standing of relatively few students. Experienced 
teachers can under favorable conditions estimate with consider- 
able accuracy the achievements of their pupils. If the class is 
reasonably small and if the teacher has used methods of instruc- 
tion which call for frequent oral and written performances by the 
students and has kept a careful record of these performances 
throughout the term, his estimates will generally be relatively 
accurate measures of the achievements of the students. There are, 
however, certain limitations which should be noted. Teachers 
may be unduly influenced in their estimates by the more recent 
performances of their students. Unless careful records have been 
kept throughout the term inferior work at the beginning tends to 
be overshadowed by good or excellent work during the closing 
weeks. In case the class is a large one the teacher may not have 
an adequate opportunity for becoming acquainted with all of its 
members. 

Teachers’ estimates are likely to be materially affected by 
personal characteristics of students; one with a pleasing person- 
ality is in many cases rated higher than one who is unattractive. 
If the classwork is conducted so that there is little or no written 
performance required, teachers’ estimates will necessarily be based 
almost wholly on the oral responses given during the class period. 
Some pupils make a good showing in class when the recitation is 
oral but are at a decided disadvantage when asked to record their 
answers in writing. Frequently this difficulty is encountered 
when they are careless in their thinking and do not have clear 
ideas to express. In oral recitation they are able to make a fair 
showing because of personal characteristics and because of the 
stimulus of detailed questioning by the instructor. Furthermore, 
in a class discussion a bright student who has a good command of 
language may easily pick up ideas from other members of the class 
and recall ideas from his general experience sufficient to make a 
good showing. On the other hand there are students who express 
themselves more effectively in writing. They may be good thinkers 
but a little slow in their mental processes and not clever in dis- 
cussion. Thus there are cases in which it is difficult or impossible 
for a teacher to estimate accurately the real achievements of 
students from their daily work alone. The final examination at the 
end of the term will in a considerable number of cases furnish
additional information which is needed in arriving at the student’s
true standing. 

The final examination in itself provides a distinct type of edu- 
cational opportunity which does not occur elsewhere in the course. 
Altho the writers have no evidence to present upon this point 
they are convinced from their experience with college students
and from the comments of a number who have been exempted from
final examinations in high school that the practise deprives
students of an important educational opportunity. Not infrequently
students who have been “excused from examinations” in high 
school state that they experienced a distinct handicap when they 
entered college. If final examinations can be justified they should 
be required of all students. To use them only as a device for moti- 
vating the work of the term destroys much of their value. 


Time devoted to written examinations. Three questions were 
asked relative to the time devoted to the preparation and ad- 
ministration of written examinations. The replies from the princi- 
pals indicated that the most frequent practise is to allow ninety 
minutes for the writing of a final examination. This is the time 
allowed in 45 percent of the schools having final examinations. 
Fifteen percent allow eighty minutes and a slightly larger percent 
one hundred and twenty minutes. 

The teachers were asked to state approximately the number 
of minutes which they use “in preparing questions for a final
examination which students are allowed a total of ninety minutes to
answer.” The median time which varies only slightly for the differ- 
ent subjects is approximately fifty minutes. Individual teachers 
in the same subject differ widely in the amount of time which they 
give to this phase of their work. Two teachers, one in mathe- 
matics and one in science, stated that they spent more than six 
hours in the preparation of a set of final examination questions. 
In each subject there were a number of other teachers who stated 
that they devoted not more than thirty minutes to such work. 
It is possible that some teachers failed to interpret this question 
correctly but doubtless much of the variation is due to differences 
in the practises of teachers during the semester. Some probably 
make a memorandum of questions as they occur during the term 
and use this list as a basis for preparing the final examination. 
Also experience is a contributing factor. Teachers who have
become very familiar with the subject should be able to formulate
questions more quickly than those who are not so well versed. 

The teachers were asked also to give the approximate time 
which they used in “marking the papers of a final examination 
which students are allowed a total of ninety minutes for answer- 
ing.” They were directed to base their estimates upon a class of 
twenty-five students. The median time is approximately two and 
one-half hours. The variations between the different subjects are 
not large when the differences in their character are considered. 
The greatest number of hours are required for English and social 
science and the least for drawing and art. Here also there were 
wide variations in the amount of time reported by the individual 
teachers. A total of twenty-five teachers in which all subjects 
except home economics, shop work, and social science were in- 
cluded, stated that they devoted not more than thirty minutes to 
marking a set of papers for twenty-five students. On the other 
hand, thirty-nine teachers stated that they spent 480 minutes or 
eight hours in the marking of a single set of papers. 

It is obvious from the replies received that some teachers 
treat this phase of their work much more seriously than others or 
that they employ widely different methods. Probably some correct 
all errors or insert references which will enable the students to 
correct their own errors when the papers are returned. Others 
merely check the errors and still others probably do not attempt 
to even check each error but estimate the worth of the paper as a 
whole. The question concerning the amount of time which a 
teacher is justified in devoting to the marking of a set of
examination papers may very profitably be raised. Final examination
papers should be treated seriously and there should be an earnest 
endeavor on the part of the teacher to estimate as accurately as 
possible the grades which are assigned but it is doubtful if the ex- 
penditure of as much as twenty minutes per paper which was re- 
ported in some cases could be justified. The median practise seems 
to represent a more reasonable amount of time. 
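
The per-paper figures implied by these replies follow from simple division. The following is a minimal sketch, assuming the class of twenty-five students named in the questionnaire; the function name is illustrative only and does not come from the investigation.

```python
CLASS_SIZE = 25  # class size the teachers were directed to assume


def minutes_per_paper(total_minutes: float, class_size: int = CLASS_SIZE) -> float:
    """Average marking time per paper for one set of examination papers."""
    if class_size <= 0:
        raise ValueError("class_size must be positive")
    return total_minutes / class_size


median_practise = minutes_per_paper(150)  # two and one-half hours for the set
longest_report = minutes_per_paper(480)   # eight hours for the set
shortest_report = minutes_per_paper(30)   # thirty minutes for the set

print(median_practise, longest_report, shortest_report)
```

On these assumptions the median practise amounts to about six minutes per paper, while the eight-hour reports come to a little under twenty minutes per paper.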

Characteristics noted in marking examination papers. The 
principals were asked to state whether it was the practise in their 
schools for teachers to subtract from a pupil’s grade for (1) poor 
writing, (2) poor spelling, (3) poor English. Seventy-one princi- 
pals or 42 percent stated that teachers were accustomed to make 
deductions for poor writing. In 60 percent of the schools it was the 




TABLE I. PERCENT OF TEACHERS REPORTING INTENTIONALLY
LOWERING A STUDENT’S GRADE FOR POOR WRITING, POOR
SPELLING, AND POOR ENGLISH

Subject                  Poor Writing   Poor Spelling   Poor English
Ancient Language              31            7[?]             80
Commercial Subjects           61            74               72
Drawing and Art               65            82               67
English                       60            96               98
Home Economics                37            72               73
Mathematics                   31            48               48
Modern Language               31            82               74
Music                         48            60               48
Science                       35            53               56
Shop and Vocational           32            77               74
Social Science                [?]           61               65


practise to lower a student’s grade for misspelled words and in 68 
percent for poor English. Fifteen principals or 9 percent reported 
that all three characteristics were recognized only in the marking 
of papers in courses in English. 

The teachers were asked if they intentionally lowered a 
student’s grade because of each of the three characteristics
mentioned in the above paragraph. A summary of their replies is given
in Table I, which indicates considerable variation with reference 
to the influence of poor writing, poor spelling, and poor English 
upon examination grades. Since writing, spelling and English 
may be considered essential parts of courses in English we should 
naturally expect that teachers of this subject would intentionally 
lower a student’s grade for defects in any of these characteristics. 
Outside of the subject of English, the majority of teachers do not 
lower a grade for poor writing except in commercial subjects, and 
drawing and art. With the exception of mathematics deduction 
is made by most teachers for poor spelling. The potency of poor 
English in determining a student’s grade is slightly less than that 
of spelling in a number of subjects. 

The handwriting, spelling, and quality of English which a 
student uses in writing an examination should be recognized. 
It does not, however, follow that a student’s standing should be 
intentionally lowered for poor handwriting, poor spelling, and 
poor English in school subjects other than English. When this is 
done his grade becomes a measure of these abilities as well as of 
the abilities in the field of the subject in which the examination is
given. In history, for example, a student’s grade would become a 
composite measure of his achievement in history, the legibility of
his handwriting, the quality of his spelling and the use of grammati- 
cally correct English. As a result both the teacher and the student 
are likely to be confused concerning the shortcomings of the ex- 
amination paper. A better procedure would be to keep a record of 
the errors in spelling, poor writing, and poor English and when it 
is considered desirable a separate grade may be given covering 
these three characteristics. Credit for a course may be withheld 
until the student has brought his handwriting, spelling and Eng- 
lish up to a satisfactory standing. 


The weighting of questions. Sixty-four percent of the high 
school principals indicated that their teachers were accustomed to 
give more credit for correct answers to difficult questions than to 
easy ones. Approximately four-fifths of the teachers replying to 
the questionnaire stated that they attempted to weight examina- 
tion questions on the basis of difficulty. Thus there is a very 
definite effort to eliminate the errors introduced in examination 
grades by the unequal difficulty of questions. (See page 10.) 
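
The effect of weighting may be illustrated by a short sketch. It assumes that each question carries a relative weight chosen by the teacher and that the pupil earns some fraction of full credit on each; the weights, credits, and function name are hypothetical and are not taken from the questionnaire.

```python
def weighted_score(weights, credits):
    """Examination score on the 100-percent scale.

    weights: relative weight of each question (heavier = more difficult)
    credits: fraction of full credit earned on each question, 0.0 to 1.0
    """
    if len(weights) != len(credits):
        raise ValueError("one credit is needed for each question")
    if any(not 0.0 <= c <= 1.0 for c in credits):
        raise ValueError("credits must lie between 0 and 1")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    earned = sum(w * c for w, c in zip(weights, credits))
    return 100 * earned / total


# Hypothetical five-question paper: the pupil answers the easy questions
# correctly, half of the fourth, and misses the most difficult one.
credits = [1.0, 1.0, 1.0, 0.5, 0.0]
weighted = weighted_score([1, 1, 2, 2, 4], credits)
unweighted = weighted_score([1, 1, 1, 1, 1], credits)

print(weighted, unweighted)
```

The same set of answers yields a different score according to whether the questions are counted equally or in proportion to their assumed difficulty.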


Recognition of rate of work. Eighty-two percent of the
teachers stated that they were accustomed to set examinations 
short enough so that practically all students could answer all the 
questions. Only 32 percent noted the time which each student 
spent in writing his examination paper, and only 8 percent said 
it was their custom to set examinations long enough so that practi- 
cally no student would have time to answer all of the questions. 
Thus it is clear that relatively few teachers recognize the stud- 
ent’s rate of work in determining his grade on an examination. 

Incidentally it may be noted that when examinations are 
short enough so that practically all students can finish a great deal 
of time is wasted. Individual differences exist in all classes and it is 
not at all unusual to find some student finishing in one-third to 
one-half of the time which others devote to the examination. 
Aside from the waste of time which results from this practise, it is 
likely that the confusion caused by the leaving of those pupils who 
have finished tends to disturb the attention of those who are still 
writing. If final examinations constitute a valuable educational 
opportunity, there is no justification for wasting time. It is much 
better to set an examination long enough so that practically all 
students will be occupied for the entire period. 

TABLE II. PERCENT OF TEACHERS GIVING AFFIRMATIVE ANSWERS
TO FOUR QUESTIONS RELATING TO THE MARKING OF EXAMINATION
PAPERS

                         Correct     Each Ques.   One Ques.   Each paper
Subject                  Answers     on one       on all      as a
                         Written     paper        papers      whole
Ancient Language            9          7[?]          24          20
Commercial Subjects        48          82            16          31
Drawing and Art            40          75            25          42
English                    22          76            19          34
Home Economics             19          68            24          38
Mathematics                72          75            18          24
Modern Language            15          74            23          33
Music                      35          80            21          42
Science                    27          72            27          [?]
Shop and Vocational        36          75            27          37
Social Science             18          [?]           22          40


Method of marking examination papers. Scientific investi- 
gation has revealed that the reliability of examination grades can 
be materially increased by adopting a systematic method in
marking papers.* Among the procedures recommended are the writing
out of correct answers, and the grading of one question on all of the 
papers before taking up another question. In order to ascertain 
the practise of high-school teachers in marking examination papers 
the following four questions were included in the questionnaire 
sent to them. 

1. Before starting to grade a set of examination papers do you
write out the answers which you consider correct?

2. Do you usually grade all the answers on one paper before 
taking up those of another paper? 

3. Do you usually grade the answers to one question on all 
of the papers before taking up the answers to a second 
question? 

4. Instead of marking the answers to each question separately
do you attempt to estimate the value of the paper as
a whole? 

The percent of teachers giving affirmative answers to these ques- 
tions is given in Table II. A few apparent discrepancies in this 
table are due to the fact that certain teachers did not answer 
all the questions. Normally we should expect a teacher who
answered the second question affirmatively to answer the third one
negatively. This, however, did not always happen.


*Kelly, F. J. “Teachers’ marks,” Teachers College Contributions to Education, 
No. 66. New York: Teachers College, 1914, 83 p. 

With the exception of mathematics, it is not the custom of
teachers to write out the answers to their questions. No data are 
at hand to show what effect this has upon the accuracy of the ex- 
amination marks. Experience with standardized tests would indi- 
cate that the failure to write out correct answers, at least in ab- 
breviated form, would operate to make the grading of examina- 
tion papers less accurate. 

About three-fourths of the teachers are accustomed to mark 
all the questions on one paper before taking up another. This plan 
has the advantage of enabling the teacher to consider a pupil’s 
performance as a whole. In the case of students who make a large 
number of errors the teacher will find this helpful in providing 
remedial instruction. It has, however, been proposed that the 
reliability of examination grades can be increased by marking the 
answers to one question on all the papers before taking up the 
answers to another question. 


Directions to students concerning methods of work. The 
high-school teachers were asked the following question, “Do you
prepare in written form carefully worded directions to the stud- 
ents regarding the procedure they are to follow in answering the 
questions? (These directions might include such points as, order 
in which questions are to be answered, length of answer, arrange- 
ment of work, etc.)” Only 35 percent of the teachers gave an 
affirmative answer. It is possible that in some classes there is a 
sufficiently definite understanding concerning the methods of 
work to be followed and explicit directions are unnecessary. 
However, it is likely that in many cases students would be able to 
give more truthful evidence of their ability if they were given pre- 
cise directions concerning the length of answer, desired arrange- 
ment of work, etc. The order in which the questions are to be 
answered is a point which should be stressed. In the case of 
questions that are at all indefinite or general there should be speci- 
fications concerning the degree of elaborateness which is expected 
in the answer. 


Recognition of a standard distribution in assigning grades to 
examination papers. The teachers were asked the following ques- 
tion, “In assigning grades to examination papers do you attempt 
to have their distribution conform to any standard form such as 
the normal distribution?” Only 31 percent of the teachers gave an
affirmative answer to this question. This probably means that 
relatively few teachers have recognized the distinction between
“scores” and “grades” (see pages 11-13 for an explanation of this
distinction) and for this reason are neglecting one means of making
their grades more accurate measures of school achievement. 


Relation of examination grades to final grades. The princi- 
pals were asked if they advised their teachers as to the proportion 
of the final mark for the semester which should be based upon the 
final examination. Only eleven or 7 percent replied negatively. 
Of those who gave advice on this matter 95 percent made a definite 
ruling. The most frequently mentioned proportions allotted the 
final examination are 25, 30, 33 1/3, and 40 percent. In 4 percent of
the schools the examination counts for one-half in determining a 
student’s final grade, in 1.3 percent for only one-tenth of the final 
grade. The teachers were also asked this same question. The 
replies varied from 10 percent to 50 percent. Except in science and 
shop work, the median practise is to estimate the final examina- 
tion mark as one-third in determining a pupil’s final grade. A 
considerable number of teachers indicated that they gave not 
more than one-fourth or one-fifth value to the final examination. 
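
To illustrate how such a ruling enters the final grade, the following sketch combines an average of daily grades with the examination mark. It assumes that whatever weight is not given to the examination falls on the daily average, a point on which the questionnaire secured no information; the function name and the sample marks are hypothetical.

```python
from fractions import Fraction


def final_grade(daily_average: float, exam_mark: float,
                exam_weight: Fraction = Fraction(1, 3)) -> float:
    """Weighted final grade on the 100-percent scale.

    Assumption: the weight not given to the final examination
    is given to the average of the daily grades.
    """
    if not 0 <= exam_weight <= 1:
        raise ValueError("exam_weight must lie between 0 and 1")
    w = float(exam_weight)
    return (1 - w) * daily_average + w * exam_mark


# Hypothetical pupil: daily average 84, examination mark 72.
# The weights are those reported most often, from one-tenth to one-half.
for weight in (Fraction(1, 10), Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
    print(weight, round(final_grade(84, 72, weight), 1))
```

For this hypothetical pupil the median practise of one-third gives a final grade of about 80, and the grade falls as the examination is given more weight.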


Summary. The typical practise with reference to final ex- 
aminations in Illinois high schools may be summarized as follows: 


1. Final examinations are required of students and exempt- 
ions are made largely on the basis of scholarship. 

2. Students are allowed ninety minutes for writing a final 
examination. Teachers spend slightly less than one hour in
preparing examination questions and from two to three hours in
grading a set of papers for twenty-five students. 

3. With the exception of mathematics, the majority of teach- 
ers lower a student’s grade for spelling, and with the exception of 
mathematics and music the majority lower it for poor English. 
Poor writing is not a potent factor except in English, commercial 
subjects, and drawing and art. 

4. About three-fourths of the teachers attempt to weight the 
questions on the basis of difficulty. 

5. The majority of the teachers do not consider rate of work 
in estimating the grade assigned the final examination paper. 

6. The majority of teachers do not write out the answers to 
the questions preparatory to the marking of papers. The general 
practise is to mark all answers on one paper before taking up the 
next.

7. About one teacher in three writes out directions with
reference to the procedure which the students shall use in
answering the questions.

8. About one teacher in three tries to have his grades
conform to a standard distribution.

9. The proportion of the final mark for the semester which is 
based upon the final examination grade varies from 10 to 50
percent. The median practise is 33 1/3 percent. The majority of the
principals make a definite ruling regarding the value placed upon 
the final examination. 


CHAPTER IV


THE CONSTANT AND VARIABLE ERRORS IN 
EXAMINATION GRADES 


Constant and variable errors of measurement.¹ Two types
of errors are encountered in educational measurement. The pres- 
ence of variable errors is indicated when a test is given twice to 
the same group of pupils. The two average scores of the group 
may be the same but this will not be true for individual pupils. 
A few pupils will make the same or approximately the same score 
on the two trials. Others will make higher scores on the second 
trial than on the first, while still others will make lower scores on 
the second trial than on the first. If we assume that the average of 
the two scores obtained represents an approximately true measure 
of a pupil’s achievement then the differences between the first set 
of scores and the corresponding average scores would be the vari- 
able errors of the measures resulting from the first application of 
the test. Some of these differences approximate zero. Some of 
them are positive and about an equal number negative. Another 
set of variable errors would be obtained by using the scores se- 
cured from the second application of the test. In the case of a 
number of pupils the variable errors for the first application of the 
test will not be the same as those for the second application. Thus, 
as the name implies, variable errors change in magnitude from 
pupil to pupil within a group and also for the same pupil in a
series of measurements of the same achievements. 
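
The calculation just described may be sketched as follows. The scores are hypothetical and the function name is illustrative; the only assumption carried over from the text is that the average of a pupil's two scores is taken as an approximately true measure of his achievement.

```python
from statistics import mean


def variable_errors(first, second):
    """Variable errors of the first and second applications of a test.

    Each pupil's approximately true score is taken to be the
    average of his two scores.
    """
    if len(first) != len(second):
        raise ValueError("both applications must cover the same pupils")
    averages = [(a + b) / 2 for a, b in zip(first, second)]
    errors_first = [a - m for a, m in zip(first, averages)]
    errors_second = [b - m for b, m in zip(second, averages)]
    return errors_first, errors_second


# Hypothetical scores of five pupils on two applications of one test.
first = [70, 82, 65, 90, 78]
second = [74, 82, 61, 86, 82]

e1, e2 = variable_errors(first, second)
print(e1)
print(e2)
print(mean(first), mean(second))
```

In this hypothetical group the two average scores are the same, yet only one pupil makes the same score on both trials; the errors of the others are positive in some cases and negative in others.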

A constant error is the same for all members of a group. 
Such an error occurs in teachers’ marks where there is a tendency 
to grade too high or too low. It is found in the case of standard- 
ized educational tests when mistakes occur in the time allowed or 
when other departures are made from standard testing conditions. 
A constant error may be either positive or negative and it is gen- 
erally different for different tests. 


¹For a more detailed discussion of the nature and magnitude of the constant and
variable errors of educational measurement, see Monroe, Walter S. “The constant
and variable errors of educational measurements.” University of Illinois Bulletin,
Vol. 21, No. 10, Bureau of Educational Research Bulletin No. 15. Urbana: University
of Illinois, 1923. 30 p.

These two types of errors usually occur in combination, that
is, a given measurement may and frequently does involve both a 
constant error and a variable error. The actual error is a
combination of these two. However, in studying the accuracy of
educational measurements it is helpful to distinguish between the two
types and to consider each separately. The usual method used 
for calculating an index of the magnitude of the variable errors 
does not give any indication of the magnitude of the constant 
error. Also the method commonly used for determining the pres- 
ence and probable magnitude of constant errors does not yield 
an index of the variable errors. Furthermore, different methods 
are required for decreasing the two types of errors in educational 
measurements. 


Methods of describing the magnitude of the variable errors 
of measurement yielded by standardized educational tests. In 
describing the magnitude of variable errors in the measures yielded 
by standardized educational tests, the usual method is to have the 
test given twice to a typical group of pupils under as nearly the 
same conditions as possible. The coefficient of correlation be- 
tween the two sets of measures is taken as the index of the magni- 
tude of the variable errors. Usually there will be a constant error 
in one and sometimes in both of the sets of measures but the sta- 
tistical procedure used is such that this error does not affect in any 
way the coefficient of correlation secured. This coefficient of cor- 
relation is commonly spoken of as the coefficient of reliability. 
A coefficient of 1.00 would mean that the variable errors were 
zero. 
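
A minimal sketch of this calculation is given below, assuming that the coefficient of correlation meant is the usual product-moment coefficient. The scores are hypothetical and the function name is illustrative only. The second computation adds a constant error of five points to every score of the second application in order to show that the coefficient is unaffected.

```python
from math import sqrt


def reliability_coefficient(first, second):
    """Product-moment correlation between two sets of scores."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two equally long lists of at least two scores")
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    dx = [x - mean_x for x in first]
    dy = [y - mean_y for y in second]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary within each set")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of five pupils on two applications of one test.
first = [70, 82, 65, 90, 78]
second = [74, 82, 61, 86, 82]
r = reliability_coefficient(first, second)

# The same second set with a constant error of +5 points for every pupil.
shifted = [y + 5 for y in second]
r_shifted = reliability_coefficient(first, shifted)

print(round(r, 3), round(r_shifted, 3))
```

The two coefficients agree, which is the sense in which the index describes the variable errors only and gives no indication of a constant error.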


Available data with reference to the magnitude of errors of 
examination grades and standardized test scores not comparable. 
Investigations of the Starch-Elliott type have proven that ex- 
amination grades involve errors but the method which they em- 
ployed is different from that used in studying the errors in the 
scores yielded by standardized educational tests. Starch and 
Elliott confined their efforts to a study of the subjectivity of the 
marking of a single examination paper. Except in the use of 
quality scales, as in handwriting and English Composition, the 
scoring of standardized educational tests has been made highly ob- 
jective. Hence, there has been little need for studying the sub- 
jectivity of the marking of test papers. On the other hand, there 
has not been, so far as the writers are aware, any reported attempt
to apply to written examinations the method commonly used in 
studying the reliability of standardized educational tests. Hence, 
comparisons to show the relative reliability of the two types of 
measuring instruments cannot as yet be made. 

For this reason in the present investigation it has seemed 
worth while to apply to written examinations the same method 
which is commonly used in studying the reliability of educational 
tests. Certain modifications are of course necessary. These will 
be noted in the following paragraphs. The investigation pertains
primarily to the variable errors involved in examination grades. 
Incidentally some light will be thrown upon the magnitude of con- 
stant errors. 


Methods employed in the present investigation of the relia- 
bility of written examinations. The essential feature of the meth- 
ods employed in the present investigation is securing two inde- 
pendent examination grades for each pupil for the same units of 
work. This requires that two examinations be given to each of the 
groups of students from which data were secured. Two methods 
were used. These are described in the following directions which 
were sent to those cooperating in this investigation. 


METHOD I


Two sets of examination questions are to be prepared by a 
single person, or two or more persons working together. Each 
of the two lists should contain the same number of questions. 
There should be a distinct effort to make the two lists approxi- 
mately equal in difficulty and as nearly as possible similar in re- 
spect to the type of questions. 

After the two lists of questions have been made both should 
be given by each teacher to all of her pupils under as nearly the 
same conditions as possible. If not given on the same day, the two 
examinations should be given within a period of one week. For 
example, if two sets of examination questions in seventh grade 
geography have been prepared, both sets of questions should be 
given by each seventh grade teacher to all of her pupils. 

Each teacher is to mark both sets of examination papers for 
her pupils. In marking these papers the teacher should indicate 
the credit given for each question and write the total grade plainly 
upon the examination paper. When two or more teachers have 
given the same examinations it is not necessary that they confer
in regard to the marking of the papers. If this is done a
memorandum regarding the procedure should be attached to the
examination papers.

This method may be followed by a single teacher who has
two or more sections of a given subject. Both examinations should
be given to all sections taught by this teacher. This method of
studying the reliability of written examinations can be applied 
to any school subject. The Bureau of Educational Research is 
most interested in having it applied to arithmetic, history, geogra- 
phy, and language in the elementary school and to history, math- 
ematics, English, and science in the high school. 


METHOD II


Two sets of examination questions for the same subject are 
to be prepared by two teachers working independently, each 
teacher preparing a set. There is no requirement concerning the 
length or the difficulty of the two sets of examination questions 
except that both should cover the same amount of work. The 
teachers who prepare the questions should not confer concerning 
either the kind or the number of questions asked. 

After the questions have been prepared, both sets are to be 
given by each teacher to all of her pupils. If not given on the 
same day, the two examinations should be given within a period of 
one week. 

After the examinations have been given each teacher will 
grade all of the papers written upon the questions that she pre- 
pared. This will mean that she will grade a set of papers for her 
own pupils and also a set for the pupils of the other teacher. 
There should be no conferring between the teachers in regard to the 
method of scoring. In marking these papers the teacher should in- 
dicate the credit given for each question and write the total grade 
plainly upon the examination papers. 


The data collected. Through the city superintendents and 
high-school principals a general invitation was extended to school
systems in Illinois to participate in this investigation at the close 
of the second semester 1921-22 and also at the close of the first 
semester 1922-23. No instructions other than those just noted 
were given to those who cooperated. It should, therefore, be 
borne in mind that the data collected are for written examinations 
as they are usually given and not for special types of examinations 
or for unusual methods in the administration or the grading of the 
test papers. The reliability of examination grades could probably 
have been increased if certain directions had been formulated in 
regard to the marking of the examination papers but the purpose 
of the investigation was to determine the reliability of typical 
written examinations administered in the usual way. 

Returns were secured from seventy-two groups of children 
but it was necessary to discard the data for six groups because 
instructions had not been followed. The examinations given to the
sixty-six groups were all of the traditional type. The papers were
marked by the teachers on the scale of 100 percent, and were then 
sent to the Bureau of Educational Research. The coefficients of 
reliability reported in this chapter were calculated under the di- 
rection of the writers. 

Coefficients of reliability of examination grades. The coeffi- 
cients of reliability of written examinations for the sixty-six 
groups of students are summarized in Table III. This table also 
shows the number of students in each class, the number of ques- 
tions in each examination and the method followed in giving the 
examination. The reliability coefficients have been grouped by 
subjects and have been arranged in descending order of magni- 
tude. For those entries marked with an asterisk (*) in the column 
headed “Method,” one of the examinations was given by the principal
or some other person not actually teaching a class in the subject
at that time. However, this person was considered competent
to be in charge of the examination. The total distribution of re- 
liability coefficients is given in Table IV. 

Two of the coefficients of reliability are negative. The high- 
est is .95. It is interesting to note that the coefficients given for 
history are, on the average, higher than those obtained for arith- 
metic. The most reliable examinations given were in algebra. 
With the exception of history, arithmetic, and algebra, the num- 
ber of groups is so small that comparisons can not have much 
significance. The median coefficient of reliability .65 may be 
used as a general index of the reliability of written examinations. 

The coefficients of reliability of standardized educational 
tests. McCall² has stated that the “range of self-correlation for
many standardized tests is about .5 to about .9.” The writer’s 
experience has indicated a somewhat greater range. In Table V 
the reliability of a number of standardized educational tests is 
given. Those for the silent reading tests by Brown, Starch and 
Courtis are taken from a recent bulletin³ by the writer. The range
in this table is from .19 to .92. 


²McCall, W. A. How to Measure in Education. New York: The Macmillan Company, 1922, p. 396.

³Monroe, Walter S. “A critical study of certain silent reading tests.” University of Illinois Bulletin, Vol. 19, No. 22, Bureau of Educational Research Bulletin No. 8. Urbana: University of Illinois, 1922, pp. 33-34.

TABLE III. COEFFICIENT OF RELIABILITY FOR WRITTEN EXAMINATIONS SET BY TEACHERS

Class                        No. of         No. of         Coefficient of
Number                       Pupils         Questions      Correlation      Method
64 (Arithmetic)              21             10             .76              II
[illegible]                  35             7 and 8        .74              II
43                           24             10             .73              II
63                           38             10             .71              II
61                           64             10             .69              II
2                            37             5              .67              I
62                           38             10             .64              I
[illegible]                  88             10             .64              I
60                           41             5 and 10       .61              II
65                           33             10             .60              II
4                            [illegible]    6              .56              I
[illegible]                  17             10             .48              [illegible]
44                           22             7 and 10       .48              II
3                            55             6              .47              I
66                           73             10             .47              II
23                           27             5              .35              I
32                           56             5              .30              II
59                           21             6 and 10       .29              I
45                           43             10             .06              II
42                           27             8 and 10       -.18             II
[illegible] (Algebra)        51             10             .92              I
72                           20             30 and 19      [illegible]      [illegible]
50                           23             5 and 7        .88              II
34                           [illegible]    10 and 6       .82              I
[illegible]                  36             10             .81              II
6                            52             6              .78              -
35                           45             15             [illegible]      II
[illegible]                  54             6              .61              II
7 (Language)                 33             5              .68              -
9                            54             5              .62              -
8                            53             5              [illegible]      -
10 (Literature)              74             5              [illegible]      I
[illegible] (English)        15             3              .78              I
12                           43             5              [illegible]      II
11                           27             5              [illegible]      I
54                           37             7 and 45       .50              II
53                           [illegible]    6 and 7        .47              II
13                           52             5 and 8        .41              II

TABLE III. (Continued) COEFFICIENT OF RELIABILITY FOR WRITTEN EXAMINATIONS SET BY TEACHERS

Class                        No. of         No. of         Coefficient of
Number                       Pupils         Questions      Correlation      Method
20 (History)                 14             5              .95              [illegible]
15                           28             5              .85              I
23                           19             5              [illegible]      I
36                           43             5              .76              II
56                           19             9              [illegible]      *
67                           53             5              [illegible]      II
41                           32             8 and 10       .67              II
40                           28             5 and 7        .66              II
14                           64             5              .63              I
48                           24             10             [illegible]      II
49                           30             5 and 10       .55              II
39 (Geography)               29             10             .66              II
46                           29             5 and 10       .66              II
47                           47             5 and 10       .62              II
16                           21             5              .43              I
68                           23             10 and 8       [illegible]      [illegible]
38                           26             10 and 5       [illegible]      II
[illegible] (Civics)         63             5              .39              I
[illegible] (Latin)          31             8              .89              II
19                           30             2              .82              I
51                           50             6 and 8        .68              II
[illegible] (Spelling)       33             5              .87              I
[illegible] (Geometry)       42             5              [illegible]      II
70                           10             8              .68              [illegible]
22                           46             4 and 5        .34              [illegible]
[illegible] ([illegible])    18             5              [illegible]      I
[illegible] ([illegible])    23             7 and 8        .83              I
69 (Commerce)                16             16 and 10      [illegible]      [illegible]

TABLE IV. SUMMARY DISTRIBUTION OF COEFFICIENTS OF RELIABILITY FOR WRITTEN EXAMINATIONS

Size of Coefficient of Correlation        Frequency

[The body of this table is illegible in the source scan.]


From certain unpublished studies by the writer the follow- 
ing information has been obtained. The Courtis Standard Re- 
search Test, Series B, Forms 1 and 2 were given to pupils as fol- 
lows: Grade V, 89; Grade VI, 81; Grade VII, 52; and Grade VIII, 
38. The thirty-two coefficients of reliability ranged from .409 
to .904 with the median at .665. Forms 1 and 3 were given to a 
slightly larger group in each of the four grades. The thirty-two 
coefficients of correlation between the two sets of scores for this 
administration of Series B ranged from .528 to .963 with the 
median at .704. The Woody Arithmetic Scales, Series A, were 
given to several groups of pupils. Two scores were secured by 
using alternate items of each of the scales and applying Brown’s 
formula.4 The twelve coefficients of reliability computed in this 
way ranged from .91 to .46 with the average at .66. Forms 1 and 


⁴$r = \frac{2 r_h}{1 + r_h}$. In this formula $r_h$ is the correlation between two scores which this test yields. One is based upon reproduction and the other upon answers to questions.

TABLE V. RELIABILITY COEFFICIENTS OF STANDARDIZED EDUCATIONAL TESTS

Test                                                                      Coefficient
Illinois General Intelligence Scale*                                      .92
Courtis Standard Research Tests, Series B                                 .87
Brown Silent Reading Test - Rate                                          .86
Courtis Silent Reading Test, No. 2 - Rate                                 .85
Otis Group Intelligence Scale                                             .84
Monroe Standardized Silent Reading Test Revised* - Rate                   .84
Courtis Silent Reading Test, No. 2 - Comprehension - No. Quest.           .80
Starch Silent Reading Test - Comprehension - Words                        .77
Monroe General Survey Scale in Arithmetic*                                .76
Monroe Standardized Silent Reading Test Revised* - Comprehension          .76
Monroe Standardized Silent Reading Test Revised* - Rate                   [illegible]
Monroe Standardized Silent Reading Test Revised* - Comprehension          [illegible]
Starch Silent Reading Test - Comprehension - Ideas                        [illegible]
[name partly illegible] Attainment Scales, No. 1                          .66
Starch Silent Reading Test - Rate                                         .62
[name illegible]                                                          .59
Courtis Silent Reading Test, No. 2 - Comprehension - Index                .58
Pressey [name partly illegible] Vocabulary Scale                          .37
Brown Silent Reading Test - Comprehension - Quantity                      .36
Pressey [name partly illegible] Scale                                     [illegible]
Brown Silent Reading Test - Comprehension - Quality                       .19

*Monroe, Walter S. “The Illinois Examination.” University of Illinois Bulletin, Vol. 19, No. 9, Bureau of Educational Research Bulletin No. 6. Urbana: University of Illinois, 1921, p. 47.

Pressey, W. “Group Scale of Intelligence for Use in the First Three Grades: its validity and reliability,” Journal of Educational Research, 1:285-94, April, 1920.

Unpublished data of the Bureau of Educational Research, University of Illinois.

Colvin, S. S. “Some recent results obtained from the Otis Group Intelligence Scale,” Journal of Educational Research, 3:1-12, January, 1921.


2 of Monroe’s Standardized Reasoning Test in Arithmetic were 
given to pupils as follows: Grade V, 36; Grade VI, 92; Grade VII, 
76; Grade VIII, 81. The coefficients of reliability for correct 
principle were as follows: .530, .630, .645, and .723; for correct 
answer they were .518, .528, .576, and .707. Using Brown’s 
formula the coefficients of reliability for Gray’s Silent Reading 
Tests were computed for thirty grade groups. These coefficients 
ranged from .55 to .85 with the median at .67. The number of 
pupils per group was less than 100 in only five cases. For several 
grade groups reliability coefficients were secured for Monroe’s
Standardized Silent Reading Tests which ranged from .222 to .907 
with an average of .669. 
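
The use of alternate items with Brown's formula, mentioned above for the Woody scales and for Gray's tests, may be sketched as follows. The sketch assumes the usual form of the formula, in which the correlation between the two half-scores is stepped up to an estimate for the whole test. The function names and the sample values are hypothetical.

```python
def alternate_item_scores(item_marks):
    """Return (score on items 1, 3, 5, ..., score on items 2, 4, 6, ...)."""
    return sum(item_marks[0::2]), sum(item_marks[1::2])


def brown_formula(r_half):
    """Step up the correlation between two half-tests to the whole test."""
    if r_half <= -1:
        raise ValueError("formula is undefined for r_half <= -1")
    return 2 * r_half / (1 + r_half)


# One pupil's marks on five items (1 = right, 0 = wrong), hypothetical.
print(alternate_item_scores([1, 0, 1, 1, 0]))

# Hypothetical correlations between the two half-scores of a class.
for r_half in (0.30, 0.50, 0.70):
    print(r_half, round(brown_formula(r_half), 3))
```

The two half-scores of every pupil in a group are first correlated in the ordinary way, and only the resulting coefficient is passed to the formula. The stepped-up value always lies farther from zero than the half-test correlation, since the whole test is twice as long as either half.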

Haggerty has computed the reliability for both Sigma 1 and 
Sigma 3 of his Reading Examination by having the same test repeated.
In the case of Sigma 1 the interval between the two applications
of the test was six weeks. For 200 children in Grades I to
III inclusive the coefficient of reliability .84 was obtained. In the 
case of Sigma 3 the interval between the two applications was only
two days. For 126 pupils from Grades V to VIII, inclusive, the 
coefficient of reliability was found to be .885. For the sentence 
test alone the reliability coefficient was .769 and for the paragraph 
test, .806. For Thorndike’s Scale Alpha for the Understanding of 
Sentences, McCall has reported a coefficient of reliability of .37. 
This was obtained by using a test similar to Alpha but not considered
a duplicate form. Gates⁵ reported reliability coefficients
for Thorndike-McCall Reading Scale which ranged from .25 to .72. 
All of these were for pupils belonging to a single grade. For the 
Burgess Picture Supplement Scale the author has given coeffi- 
cients of reliability ranging from .62 to .88 for grade groups from 
the second to sixth grades inclusive. In each case the number of 
pupils was relatively small. Gates gave coefficients of .62, .59 and 
.66 for three grade groups. 

For the Otis Self-Administering Test of Mental Ability the 
author has reported an average reliability coefficient of .921 for 
the higher examination and of .948 for the intermediate examina- 
tion. Presumably these coefficients are based on the scores se- 
cured from pupils for a sequence of several grades. For the 
separate tests of the Stanford Achievement Test the authors re- 
ported coefficients of reliability based upon separate grade groups 
which ranged from .75 to .96. When the composite score of all the 
tests was used the reliability coefficient was .98. 

The relative reliability of written examinations and stand- 
ardized educational tests. The data which have just been sub- 
mitted indicate that the difference between the reliability of the 
two types of instruments is not as great as is commonly believed. 
The median of the reliability coefficient for written examinations 
given in Table IV is .65. There are many reliability coefficients 
for standardized tests in Table V which are less than this. Further- 
more, the additional citations of coefficients of correlation in the 
above paragraphs indicate that for a number of standardized edu- 
cational tests which have been very widely used the median of the 
reliability coefficients for grade groups is in the neighborhood of 
.65. Thus the conclusion seems justified that altho some of our 
more elaborate standardized tests, such as the Stanford Achievement
Test, the Illinois General Intelligence Scale, and the Otis
Self-Administering Test of Mental Ability, may be expected to
yield measures whose reliability is greatly in excess of that of
typical written examinations, many widely used standardized
educational tests yield measures which possess about the same
degree of reliability as the grades obtained from written examinations
prepared by teachers and other school officials. It
should be noted that reliability refers only to the variable errors
of measurement. The constant errors, as we shall show (p. 40),
are likely to be very much larger in examination grades than in
the scores yielded by standardized educational tests. It should
also be noted that the time required to give many of the standardized
tests is much less than that devoted to a typical written
examination.

⁵Gates, Arthur I. “An experimental statistical study of reading tests,” Journal of Educational Psychology, 12:379, October, 1921.

The absolute reliability of examination grades. The state- 
ment that the reliability of a typical examination is equivalent to 
that of many standardized tests and only slightly less than that of 
a number of others still leaves a doubt with reference to the abso- 
lute reliability. For practical purposes the reliability coefficient 
of .65 needs to be interpreted in terms of the variable errors of 
measurement to be expected. The correlation tables for eight 
groups having a reliability coefficient of approximately .65 were 
taken and the scores translated into a five point system of school 
grades. It is assumed that these classes were typical and the highest
scores were translated into a mark of “A,” the lowest into a
mark of “E.” This was done in an arbitrary way but the results 
indicate roughly one meaning which may be attached to a re- 
liability coefficient of .65. It was found that in 40 percent of the 
cases the students received the same grade in the two examina- 
tions. In an additional 42 percent the grade which they received 
on the first examination was only one point higher or lower than 
that received on the second. For example, if a student in this 
group made a “D” on one examination, he made an “E” or “C” 
on the other. The two grades received by the remaining 18 
percent differed by two points or more. 
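
The tabulation just described may be sketched as follows. The report says only that the translation into letter grades was made in an arbitrary way, so the cut-off points used here, like the ten pairs of scores, are hypothetical.

```python
def to_grade_index(score, cutoffs=(90, 80, 70, 60)):
    """Map a percent score to 0..4 (A..E) using hypothetical cut-off points."""
    for index, cutoff in enumerate(cutoffs):
        if score >= cutoff:
            return index
    return len(cutoffs)


def agreement(first, second):
    """Percent of pupils whose two grades are the same, one apart, or further apart."""
    counts = {"same": 0, "one apart": 0, "two or more apart": 0}
    for x, y in zip(first, second):
        gap = abs(to_grade_index(x) - to_grade_index(y))
        if gap == 0:
            counts["same"] += 1
        elif gap == 1:
            counts["one apart"] += 1
        else:
            counts["two or more apart"] += 1
    n = len(first)
    return {key: 100 * value / n for key, value in counts.items()}


# Hypothetical scores of ten pupils on the two examinations.
first = [95, 82, 74, 61, 55, 88, 79, 68, 91, 72]
second = [91, 75, 78, 72, 58, 69, 81, 65, 85, 70]
print(agreement(first, second))
```

A tabulation of this kind states the meaning of a coefficient in terms a teacher can use directly, namely how often a pupil's letter grade would have been different had the other examination been given.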

Conditions tending to produce variable errors of measure- 
ment in examination grades. Several sets of examination papers 
were examined in order to ascertain the conditions which tended 
to produce the lowest coefficients of reliability and hence the 
largest variable errors of measurement. The most potent cause 
appeared to be that the two teachers recognized widely different
educational objectives in making out the two sets of examination 
questions. This seemed to be the case in Group 42, arithmetic, 
for which the coefficient of correlation was -.18. In Group 22,
geometry, there was a difference in the general plan of the exami- 
nations; one teacher permitted the students to choose one of two 
questions in part of the examination while the other required that 
all questions be answered. This difference in the plan of the ex- 
amination appeared to increase the variable errors of measure- 
ment. There was also a difference in the educational objectives 
recognized in that one teacher placed much more emphasis upon 
the practical application of geometry than the other. 

Another cause which operated to lower the degree of correla- 
tion and hence to increase the magnitude of the variable error was 
the adherence to different standards of excellence by the teachers 
who graded the papers. For example, in Group 45, arithmetic, 
one teacher considered only the final answer to the exercise; if that 
was right the student received full credit—if wrong, no credit was 
given. The other teacher gave credit for correct principle. The 
coefficient of reliability for this group was .06. 

It was noticed that in general pupils made higher grades on 
the tests set by their own teacher than on those set by another 
person. This appeared to be true even when distinct differences 
could not be identified either in the educational objectives or in 
the methods of grading of the two teachers. In Group 32 for which 
a reliability coefficient of .30 was obtained when the grades made 
on the first examination were correlated with those made on the 
second examination, a second coefficient was calculated by com- 
paring the student’s grade made on the examination prepared by 
his own teacher with that set by another teacher. This procedure 
gave a coefficient of correlation of .40. When the two classes were 
taken separately coefficients of .57 and .44 were obtained. These 
data tend to supplement the evidence already cited that differences
in the content of the examination and in the plan of marking
are potent factors in producing the variable errors of measurement. 

The magnitude of constant errors in examination grades.
It is probable that most of the teachers marking the examination
papers did not recognize the distinction between “scores” and
“grades”⁶ and that the marks placed upon the papers were considered
as “grades.” In several instances the “grades” made on
one examination were on the average much higher than those
made on the other. If “scores” were used as “grades,” any differences
between the averages of the two sets of measures indicate
the presence of constant errors. In order to secure an index of
their magnitude the differences were calculated for the sixty-six
groups to which two examinations were given. These are assembled
in Table VI. For three of these groups the difference between
the averages of the two sets of “grades” was zero; for eight other
groups it was one. At the other extreme we find a difference of 50
in the case of one group. The median difference is 6.2.

⁶See page 11 for a statement of this distinction.

TABLE VI. DISTRIBUTION OF DIFFERENCES BETWEEN AVERAGES OF EXAMINATION GRADES

Difference        Frequency

[The body of this table is illegible in the source scan.]

It should be noted that the differences between the averages 
of two sets of examination grades are not constant errors. They 
are merely indicative of the presence of constant errors. If one 
examination was easy and the other hard the difference would be 
the sum of a positive error and a negative error. If both examina- 
tions were hard the difference would be smaller than the constant 
error in either average. The large differences shown in Table VI 
are probably caused by the combination of an easy examination
with a difficult one. This was very obviously true in the case of 
the one difference of 50. Furthermore, in interpreting Table VI it 
should be remembered that possibly some of the teachers recognized
the distinction between “scores” and “grades,” and the
marks would have been appropriately adjusted before being used 
as “grades.” 
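
The arithmetic behind this caution may be illustrated by a brief sketch. It assumes that each observed class average is the true average plus the constant error of that examination, so that the true average cancels out of the difference. The function name and the figures are hypothetical.

```python
def difference_between_averages(constant_error_first, constant_error_second):
    """Absolute difference between two class averages sharing one true average.

    Each observed average is the true average plus that examination's
    constant error, so only the two constant errors remain in the difference.
    """
    return abs(constant_error_first - constant_error_second)


# Hypothetical constant errors in percent points (positive = examination too easy).
easy_with_hard = difference_between_averages(+15, -20)
both_hard = difference_between_averages(-20, -25)
print(easy_with_hard, both_hard)
```

In the first case the difference is the sum of the magnitudes of the two errors; in the second it is smaller than either error taken alone, so a small difference between the averages does not show that the constant errors are small.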

So far as it was possible to ascertain from an analysis of the 
examination papers the large differences are due to two causes— 
differences in the difficulty of the two sets of examination questions
and in the severity of the grading. For example, one of the
examinations which produced a difference of 40 consisted of seven 
questions of which the pupils were permitted to answer any five. 
These questions were relatively easy. In the other examination, 
there were ten questions and the pupils were required to answer 
all of them. Very few were able to complete this second examina- 
tion in the time allowed and the teacher appears to have counted 
the unfinished exercises as failures. Nine out of twenty-two child- 
ren in the second group made zero on the examination. In this 
way a very large constant error was introduced but the coefficient
of reliability for this group was .48. 


Relative magnitude of constant errors in examination grades 
and in standardized test scores. In another place⁷ the writer has
discussed the magnitude of the constant errors in educational tests. 
In cases where there has been coaching for tests, intentional or not, 
or disregard for standard directions, large constant errors have 
been introduced. In one extreme instance a constant error of over 
three and a half years occurred in the mental age scores of a group of 
children. In general, however, because of the standard directions 
for administering the tests and scoring the papers, of the objectiv- 
ity of the marking, and of the norms for interpreting test scores, 
the constant errors in standardized tests are very much smaller, 
and are likely always to be smaller than those found in examina- 
tions given by teachers. However, some reduction in the magni- 
tude of the constant errors in examination scores will result when 
the use of either very easy or very difficult sets of questions is 
avoided and when a conservative plan of marking is followed. 


⁷Monroe, Walter S. “The constant and variable errors of educational measurements.” University of Illinois Bulletin, Vol. 21, No. 10, Bureau of Educational Research Bulletin No. 15. Urbana: University of Illinois, 1923, pp. 19-20.

Explanation of the apparent contradiction between the results
of this investigation and previous studies of examination
grades. The results of this investigation have caused the writers
to revise their estimate of the accuracy of examination grades. 
The findings indicate that the errors are much less than they ap- 
peared to be from evidence resulting from investigations of the 
Starch-Elliott type. One naturally asks the question, “Why this 
apparent contradiction?” Starch and Elliott obtained similar re- 
sults for three different examination papers and numerous other 
investigators have corroborated their findings. The mass of evi- 
dence accumulated is so extensive and uniform in character that 
one would naturally be inclined to accept the conclusions indicated 
in preference to the apparent contradictory results of the present 
investigation. However, a careful analysis of the procedures re- 
veals that the results are not necessarily contradictory. The 
method followed by Starch and Elliott combines both constant 
errors and variable errors. The “grades” assigned to the examination
paper in geometry were influenced both by the subjectivity
of the marking and by the tendency of some teachers to grade high 
and of others to grade low. The present investigation has separat- 
ed the variable errors from the constant. It has also shown that 
the examination scores have in some cases involved relatively 
large constant errors. The extreme differences between the grades 
assigned to the same paper reported by Starch and Elliott (see 
page 9) are easily explained when it is understood that they repre- 
sent the combination of variable errors and constant errors. 
Especially is this true when we realize that the constant errors 
would likely be larger for teachers of different schools as in their 
investigation than for teachers in the same school as in the present 
investigation. 
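
How the two kinds of error combine when several teachers mark one paper may be shown by a small sketch. It assumes, for illustration only, that each mark is the merit of the paper plus the teacher's constant error plus a variable error. All of the figures are hypothetical and are not taken from Starch and Elliott.

```python
def spread(values):
    """Range (highest minus lowest) of a list of marks."""
    return max(values) - min(values)


true_mark = 75  # hypothetical merit of a single paper, in percent

# Hypothetical errors for five teachers marking that same paper.
constant_errors = [-10, -4, 0, 5, 12]  # each teacher's habitual severity or leniency
variable_errors = [3, -2, 4, -5, 1]    # chance fluctuation in this one marking

variable_only = [true_mark + v for v in variable_errors]
combined = [true_mark + c + v for c, v in zip(constant_errors, variable_errors)]

print(spread(variable_only), spread(combined))
```

With these figures the marks are spread much more widely when the constant errors of the several teachers are added to the variable errors than when the variable errors act alone.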

Conclusion with reference to relative accuracy of examination 
grades and scores yielded by standardized tests. As already 
indicated the writers believe that the data presented in this 
chapter show that examination grades are more accurate meas- 
ures of achievement than many persons have considered them to 
be. Standardized tests yield scores involving errors, both con- 
stant and variable, but in the case of our best standardized tests 
these errors are distinctly less than the corresponding errors in ex- 
amination grades. Furthermore, measurement by means of 
standardized tests usually requires much less time than is commonly
devoted to written examinations. This conclusion refers
to written examinations of the traditional type and admin- 
istered under typical conditions. It is likely that written examina- 
tions and their administration may be improved so that the 
difference in the accuracy of examination grades and test scores 
will become much less than at present.⁸


⁸The conditions of standardized tests would have been more closely approximated
if both sets of examination questions had been prepared by the same person and marked 
by different persons. If this had been done it is reasonable to expect that the coeffi- 
cients of reliability would have been somewhat higher and the differences in the aver- 
ages of the two sets of scores smaller. 
CHAPTER V
THE CONTENT OF WRITTEN EXAMINATIONS 


The data collected. In response to an invitation sent to 
superintendents and high-school principals in Illinois sets of ex- 
aminations were received from fifty-six schools for the first semes- 
ter and from fifty schools for the second semester of the school 
year 1921-22. A range of approximately sixty subjects was repre- 
sented. It seemed desirable to restrict this analysis of sets of 
questions to the thirteen subjects listed in Table VII. The num- 
ber of sets of questions and also the total number of questions are 
given in this table. 

Classification of questions according to type. After consider- 
able experimentation a list of fifty types of questions as given be- 
low was formulated. 


Aims
Analysis
Cause (give)
Classification
Comparison
Completion
Conjugation
Construction (a figure, study or statement)
Construction (give the)
Contrast (general)
Contrast (specific basis)
Correction
Criticism
Decision (choice or preference)
Declension
Definition
Description (characterization)
Diagram (illustrate by)
Discussion
Effect (give the)
Evaluation
Example (illustrate by)
Expansion
Explanation (tell why or how)
Facts (definite number)
Facts (indefinite number)
Factoring
How many (tell)
Law (give the)
Mathematical operations of addition, subtraction, multiplication, and division
Method
Outline
Parsing
Proof
Punctuation (capitalize and correct sentences)
Recall
Reduction to lowest terms
Relationships (give the)
Rule
Scanning
Simplification
Source
Substitution (values for letters)
Summary
Solving for unknown quantity
Syllabus
Translation (foreign language to English)
Translation (English to foreign language)
Use (give the)
Where (tell)

TABLE VII. NUMBER OF QUESTIONS AND SETS OF QUESTIONS EXAMINED

Subject                  Sets      Questions
English I                80        [illegible]
English II               83        694
English III              79        [illegible]
Algebra I                80        731
Plane Geometry           80        636
Latin I                  81        683
Latin [illegible]        76        539
Physics                  76        795
General Science          62        789
Civics                   59        560
American History         62        550
Domestic Science         42        392
Domestic Art             41        368

Total                    901       7621


All questions for the thirteen subjects mentioned in Table VII 
were classified under some one of these types. This classification 
was made by Mr. Souders with the assistance of a single clerk 
working under his immediate direction. Altho any classification 
of this kind is necessarily subjective, a relatively high degree of 
uniformity has, we believe, been secured. 

Summary of classification. Twenty-six of the fifty types of 
questions were represented in six or more of the thirteen subjects. 
The relative frequency of each is given in Table VIII. This classi- 
fication of examination questions shows a high frequency of cer- 
tain types and very little or no use of a number of other types. 
If we omit Latin, Algebra, and Plane Geometry in which the na- 
ture of the subject-matter restricts the kind of question asked, 
we find that 32 percent of all the questions require “explanation.” 
The next most frequent type used, 21 percent, calls for a “definite 
number of facts.”

Frequently all questions are considered as belonging to one of 
two groups, “thought questions” or “memory questions.” Such a
definite classification is not, however, always possible. The 
character of the mental process involved in answering depends 
upon the person replying as well as upon the form of the question 
asked. Those questions calling for definite facts are almost cer- 
tain to be based upon memory; on the other hand, those requiring 
classification, evaluation, contrast, etc. are likely to demand 
thought on the part of most students. If, however, such classifications
or evaluations have been made in a previous class exercise
some students may easily remember the answers and, in such a 
case, a thought question for one student becomes a memory ques- 
tion for another. For the purpose of this study Types 10, 20, 22, 
25, and 26 have been designated as probable memory questions, 
the remaining types as probable thought questions. The percent 
of each group is given in the last two lines of Table VIII. These 
percents can be considered as only a rough indication of the re- 
lative frequency of these two very general divisions. 
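
The division into the two groups amounts to a simple tally, which may be sketched as follows. The five type numbers are those named in the text; the counts of questions by type are hypothetical and are not taken from Table VIII.

```python
MEMORY_TYPES = {10, 20, 22, 25, 26}  # type numbers designated in the text


def memory_thought_percents(counts_by_type):
    """Return (percent probable memory, percent probable thought) questions."""
    total = sum(counts_by_type.values())
    if total == 0:
        raise ValueError("no questions to classify")
    memory = sum(count for kind, count in counts_by_type.items() if kind in MEMORY_TYPES)
    memory_percent = 100 * memory / total
    return memory_percent, 100 - memory_percent


# Hypothetical number of questions of each type in one subject.
counts = {3: 40, 10: 25, 17: 60, 20: 30, 24: 120, 25: 75, 26: 50}
print(memory_thought_percents(counts))
```

Since the assignment of a type to one group or the other is itself only probable, percents obtained in this way share the roughness noted above.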

In her investigation of “the question as a measure of efficiency
in instruction,” Dr. Stevens¹ attempted to determine
the relative number of thought questions and memory questions 
asked by teachers in a single class period. The percents of memory 
questions for history, English and science were 83, 55, and 67 
respectively. This relative frequency is much larger than indi- 
cated in Table VIII. The difference may be due to the fact that 
in the present investigation only written examination questions 
were considered, but it is altogether likely that it is indicative of a 
real change in the type of questions which teachers commonly ask 
of their students. 


Relation of questions to educational objectives. The ques- 
tions which teachers ask during class periods constitute a concrete 
expression of the educational objectives which they are day by 
day setting for their students. The questions of the final examinations
should, therefore, be representative of the types of educational
objectives set in the different school subjects.² The emphasis
upon memory and some of the simpler types suggests a need for 
a modification in emphasis in most of the school subjects. 


Quality of examination questions. Altho the writers have no 
objective evidence to present in regard to the quality of examina- 
tion questions, those submitted for this study were in general 
considered good. Catch questions or those stated so that they 
would not be understood easily by students were very rare. Many 
questions were stated so that the grading of the answers was ob- 
jective and would indicate that their form had been influenced by 


¹Stevens, Romiett. “The question as a measure of efficiency in instruction.” Teachers College Contributions to Education, No. 48. New York: Teachers College, Columbia University, 1912.

²See page 55 for a further discussion of objectives in school subjects.

case of the examinations studied in this investigation.
CHAPTER VI
THE IMPROVEMENT OF WRITTEN EXAMINATIONS 


Altho we now have a number of standardized tests which are 
superior to written examinations, and we have reason to believe 
that they will be used even more extensively than at present, there 
is need to give attention to the improvement of written examina- 
tions. It does not appear likely that standardized tests will ever 
replace written examinations. The latter type of measuring in- 
strument will probably continue to be the most frequently used 
means of measuring the achievements of school children. 

Written examinations may be improved by correcting the 
faults which have been noted in the preceding chapter. In this 
chapter we shall consider four important improvements: (1) 
Reduction of constant errors; (2) Reduction of variable errors; 
(3) Securing a greater agreement of the content of examinations 
with recognized educational objectives; (4) Simplification of the 
administration of written examinations. 

There is some overlapping between these improvements. 
For example, the magnitude of errors in measurement, particularly 
variable errors of measurement, will be reduced by securing a 
greater agreement between the content of the examination and 
recognized educational objectives. The devices for simplifying the 
administration of examinations also tend to make the results more 
accurate. 

Causes of constant errors in examination grades. The fun- 
damental cause of constant errors in examination grades, i.e., 
“high grades” or “low grades,” is the failure to recognize the dis- 
tinction between “scores” and “grades.” (See page 11.) A 
pupil’s grade tells his standing with reference to a norm, i.e., the 
passing mark. When no distinction is made between “scores”
and “grades” this norm is subjective. Altho the passing mark may
be defined numerically as 70 percent or 85 percent it is fixed in the 
case of a particular examination by the difficulty of the questions 
and by the severity of the marking of the papers. Pupils will receive
“high grades” when the examination is easy or the plan of
marking is generous. They will receive “low grades” when the
examination is hard and a severe plan of marking is followed.
If the teacher makes no distinction between “scores” and “grades”
he sets the norm for a particular examination when he makes out 
the questions and decides upon the plan of marking. He implicitly 
expresses the opinion that the pupil whose achievements are barely 
“passing” will make a grade of 70, or the passing mark adopted. 
He also implies that the pupil whose achievements are exception- 
ally high will make a high grade, i.e., a grade of 95 or between 95 
and 100. Such expressions are merely subjective. 

Since the failure to recognize the distinction between “scores” 
and “grades” is the cause of constant errors the plan for improve- 
ment is obvious. The papers should be marked in terms of 
“scores.” These may be on the scale of 100 but this is not essen- 
tial. In fact it will probably assist a teacher in keeping the dis- 
tinction in mind if the scores are not on the scale of 100. After the 
papers have been marked the “scores” should be translated into 
“grades” by comparison with a norm in which the subjective ele- 
ments are reduced to a minimum. 

A standard average grade used as a norm. The simplest objective¹
norm is a standard average grade. This may be set
arbitrarily but a more rational procedure would be to take the 
average of the grades given in a school on a particular subject 
during a period of several years. 

The standard average grade defines the grade into which the 
average score of a typical class should be translated. For ex- 
ample, if the standard average grade is 85 and the average score in 
a particular class is 57 the grade corresponding to this score would 
be 85. In case the class is made up of poor students the average 
grade of the class should be below the standard average grade. 
If the class is unusually bright their average grade should be 
higher than the standard average. The translation of the average 
score into the appropriate corresponding grade furnishes a basis 
for the translation of the other scores of the group. 
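
The translation just described may be sketched in code. The text fixes only one point of the translation, namely that the average score of a typical class corresponds to the standard average grade; the placing of the other scores is left to the teacher. The sketch below assumes, purely for illustration, a straight-line translation through two points: the class average score is given the standard average grade, and a hypothetical maximum possible score is given a grade of 100. The function name, the maximum score of 80, and the five scores are assumptions and do not come from the bulletin.

```python
def translate_scores(scores, standard_average_grade, max_score, top_grade=100.0):
    """Translate raw scores into grades for a class judged to be typical.

    Assumption (not from the bulletin): a straight line through
    (class average score -> standard average grade) and
    (max_score -> top_grade).
    """
    mean_score = sum(scores) / len(scores)
    if max_score <= mean_score:
        raise ValueError("max_score must exceed the class average score")
    slope = (top_grade - standard_average_grade) / (max_score - mean_score)
    return [round(standard_average_grade + slope * (s - mean_score), 1)
            for s in scores]


# Hypothetical class whose average score is 57, as in the example in the text.
scores = [74, 65, 57, 49, 40]
grades = translate_scores(scores, standard_average_grade=85, max_score=80)
```

For a class judged poorer or brighter than typical, the teacher would supply a target somewhat below or above the standard average grade in place of 85.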

The procedure just outlined is necessarily crude. It is partially
subjective because the determination of the general status of
the class is left to the teacher. However, the teacher may use
previous school records or the measures obtained from a standardized
test to assist him in arriving at a partially objective estimate
of the general status of the class. The use of a standard
distribution instead of merely a standard average grade represents
a more systematic procedure.

¹The adjective “objective” is not intended to indicate perfect objectivity or even
as high a degree of objectivity as we have in the case of many standardized tests. As
used here it means that the norm is distinctly less subjective than the norm commonly
implied in the usual examination.

1. Decreasing the magnitude of constant errors by means of 
a standard distribution of grades. For several years a number of 
educators have been urging that teachers make the distributions 
of their grades conform to a standard shape, i.e., that a specified 
percent of the members of a typical class be given a grade of A, 
another specified percent a grade of E, and so on for each of the 
marks adopted by the school.² A number of distributions have
been recommended. For a five point system of grades several 
authors have recommended the following distribution, 7, 24, 38, 
24, 7. Other distributions which have been advocated are 7, 18,
50, 18, 7 and 5, 15, 60, 15, 5.

The essential feature of the plan is a specification of the per- 
cent of the students of a given group who are to receive each mark 
rather than the particular form of distribution used. There is 
much evidence which indicates that the distribution of achieve- 
ments of an unselected group of students approximates the normal 
probability curve.³ If we assume that true measures of the
achievements of an unselected group of 100 or more are distributed
normally this assumption does not fix the percent who are to receive
each grade. The normal probability curve may be divided
in many ways, for example, it is possible to divide the curve so that 
there would be 50 percent of A’s, 20 percent of B’s, 10 percent of 
C’s, 10 percent of D’s, and 10 percent of E’s. In such a distribu- 
tion a grade of A would be given to all students above the average 
of the class. An appropriate meaning could be stated also for each 
of the grades. A distribution which is symmetrical has certain 
advantages and one of those mentioned in the preceding para- 
graph is to be preferred. The particular standard distribution to 
be used is a matter of policy which each school should determine. 
Some argue that different standard distributions be adopted for
the different years of the high school; some advocate different
standard distributions for different school subjects. It should be
noted, however, that there are certain advantages in uniformity.
It would be desirable for all high schools, particularly those in a
given state, to agree upon a common standard distribution and to
use this for all subjects. Grades assigned in different schools can
have a common meaning only when they conform to the same
standard distribution.

²If grades are expressed in percents the corresponding intervals such as 95 to 100,
90 to 94, etc. would be used instead of A, B, C, etc.

³The normal probability curve is bell shaped and is symmetrical with the average
or median as a center.
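
The statement that the normal probability curve may be divided in many ways can be illustrated with a short computation. The sketch below finds, for any set of percents, the boundaries on the base line of the curve, expressed in standard deviations from the average, which cut off those percents. The function name is an assumption made for this illustration; the two sets of percents are those mentioned in the text.

```python
from statistics import NormalDist


def cut_points(percents):
    """Return the boundaries, in standard deviations from the average,
    that divide a normal curve into bands holding the given percents.

    The percents are listed from the highest mark to the lowest.
    """
    if abs(sum(percents) - 100) > 1e-9:
        raise ValueError("percents must sum to 100")
    boundaries = []
    remaining = 100.0
    for p in percents[:-1]:
        remaining -= p                      # percent of pupils below this boundary
        boundaries.append(round(NormalDist().inv_cdf(remaining / 100), 2))
    return boundaries


symmetrical = cut_points([7, 24, 38, 24, 7])      # A, B, C, D, E
lopsided = cut_points([50, 20, 10, 10, 10])       # the 50-20-10-10-10 division
```

In the second division the boundary between A and B falls at the average of the class, which agrees with the remark that a grade of A would then be given to all students above the average.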

The proposal that teachers make the distributions of their 
grades conform to a standard shape has met with much criticism. 
As in any controversy there have been extremists on both sides 
and many of those participating have given evidence that they 
failed to understand clearly the nature of the proposal or its
essential features. Among the advocates of the use of a standard
distribution are those who have insisted that the normal probabil- 
ity curve explicitly defines the students who must receive A’s, who 
must receive B’s, etc. Cases have been reported of instructors 
who frankly admitted that a certain student deserved to receive an 
A but that they had used up all the A’s which the distribution al- 
lowed, and that, therefore, the student must be satisfied with a B. 
One hears also of instructors who announce at the beginning of a 
course that a certain number of the class must fail. It is rumored 
that in some of these instances the students enrolled have hired 
certain other students who were indifferent to their scholastic 
standing to enter the course in order to provide the requisite num- 
ber of failures. The opponents of the use of a standard distribu- 
tion have contended that there was no a priori reason why any 
student should fail and that always the quality of the student’s 
work should determine his scholastic standing. Furthermore, they 
have pointed out that in any group of students brought together 
for instructional purposes it is extremely unlikely that the distri- 
bution of achievement would approximate at all closely their pre- 
determined standard distribution. The mechanical and unin- 
telligent application of a standard distribution by some instruc- 
tors has given the opponents of the plan concrete examples of what 
they imagined to be the normal result of its use. 

A standard distribution is merely a device which teachers 
may use in order to reduce to a minimum the constant errors in 
their grades, but to be helpful it must be used intelligently. It 
must be remembered that a standard distribution is a means and
not an end. Whenever common sense indicates that the distribution
of grades for a particular class should depart from the normal
distribution no instructor should hesitate to award the grades 
which he believes the students deserve. It is intended that the 
standard distribution will be closely approximated only for a large 
unselected group of students. A particular class very frequently 
is made up of a selected group of students. Furthermore, classes 
of the usual size, 20 to 35, are so small that frequently there will be 
significant departures from this standard distribution.

Translating scores into school marks by means of a standard
distribution. A standard distribution is useful in translating examination
“scores” into “grades.” The examination papers should
be marked in terms of a score. This score may or may not be on
the scale of 100 points. In order to avoid confusion between
“scores” and “grades” it is wise to use a scale of points chosen so
that the maximum score will not be 100. If the papers have been
marked in this way the scores may be arranged in columns as
indicated in the left hand margin. The first step in translating
these scores into grades is to determine whether or not the class
is typical. If an experienced teacher has had a class for several
weeks he will usually be able to estimate its general status with a
fair degree of accuracy. At the beginning of a school year or in the
case of an inexperienced teacher some outside information is
needed. The previous school record of the students may be
studied but in many cases it will be more convenient to administer
a general intelligence test. The average mental age and the
distribution of the I.Q.’s of the class will be a very reliable index
of the composition of the group. If the median I.Q. is distinctly
below 100 the teacher may know that he has poor pupil material.
If it is much above 100 he knows that the class consists of pupils
better than the average. If there is an unusually high number of
low I.Q.’s he may expect a relatively high number of low grades.

[Marginal illustration: the scores of the class, from 74 down to 35, listed in a column from highest to lowest and bracketed into groups marked A, B, C, D, and E.]

With the general status of the class in mind the scores may be
grouped in conformity with the system of marks used. In the
illustration in the left hand margin it has been assumed that the
class is approximately typical. The percent of A’s and also the
percent of failures are somewhat larger than the percent specified
in most standard distributions. If the scores are arranged in the
form shown below the general shape of the distribution will be
more obvious. However, in the majority of cases it will be
sufficient to use the arrangement given in the margin.


[Illustration: the scores of the class arranged in five columns, one for each mark, so that the shape of the distribution can be seen.]

It is seldom that one will have exactly a symmetrical distribution
of grades for a class of this size. Some departures from the
standard distribution must be expected. In case the class is not 
typical one should expect marked departure from the standard 
distribution. For example, the distribution for a given class 
might be as shown below. In this there are no grades below passing 
but there are a number of poor students just above the passing 
mark. Also the percent of A’s and B’s is unusually large. Such a 
distribution is not normal but might well represent the distribution 
of grades for a particular class even when the normal distribution 
had been adopted as the standard. If the teacher is able to show 
that the general status of the class justifies such a departure he 
deserves commendation rather than criticism for his distribution. 


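[Illustration: a distribution of scores for a class which is not typical, with no grades below passing and an unusually large percent of A’s and B’s.]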
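
The grouping of scores in conformity with a standard distribution may be sketched as follows. This is not offered as the authors' procedure, since the bulletin leaves the grouping to the judgment of the teacher; it is only a mechanical first pass which the teacher would then adjust in the light of the general status of the class. The function name and the twenty scores are hypothetical, and the 7, 24, 38, 24, 7 distribution is one of those cited earlier in the chapter.

```python
def first_pass_marks(scores, percents=(7, 24, 38, 24, 7), marks="ABCDE"):
    """Give each score a provisional mark by rank, using cumulative percents.

    Tied scores always receive the same mark. The result is only a starting
    point for the teacher's judgment, not a final assignment.
    """
    n = len(scores)
    result = {}
    for score in sorted(set(scores), reverse=True):
        # Percent of the class scoring strictly higher than this score.
        above = 100 * sum(1 for s in scores if s > score) / n
        cumulative = 0
        for mark, p in zip(marks, percents):
            cumulative += p
            if above < cumulative:
                result[score] = mark
                break
    return result


# Hypothetical class of twenty pupils.
scores = [74, 69, 66, 64, 63, 58, 57, 56, 55, 54,
          52, 51, 47, 46, 42, 41, 40, 38, 35, 33]
marks = first_pass_marks(scores)
```

With so small a class each pupil counts for 5 percent, so the provisional percents can only approximate the standard; this is one reason why departures from the standard distribution must be expected.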
An accumulative distribution used as a check upon constant
errors. A standard distribution is also useful as a check upon the 
grades given by a teacher over a period of several terms. When the 
grades for the entire period are assembled in such a distribution 
any general tendency on the part of the teacher to give too high 
or too low grades will be revealed. Each teacher should keep an
accumulative distribution of the grades in each subject he teaches.
For example, a teacher in mathematics should keep an accumula- 
tive distribution of the grades given in classes of first-year algebra. 
When the total number of grades becomes large a comparison of 
this distribution with the standard distribution will reveal any 
tendency on the part of the teacher to grade too high or too low 
in this subject. A teacher should then take steps to correct any 
marked departures from the practise defined by the standard dis- 
tribution. In large schools where there are several sections of the 
same subject it will be helpful to secure a distribution of grades 
each time they are issued. Any marked departures from the 
standard distribution will then be called to the attention of the 
teachers. However, one should avoid giving the impression that 
there must be uniformity with the standard distribution. De- 
partures from this standard distribution are justified when the 
group of pupils can be shown to be selected. Thus a departure 
from the standard distribution is a cause for an investigation on 
the part of the teachers concerned. If evidence can be produced 
which justifies the departure no change in the system of grading
should be made. On the other hand when investigation reveals no
reasons why there should be departures from the standard dis- 
tribution, the teachers should be urged to modify their system of 
grading so that a greater uniformity will be secured. 
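
The keeping of an accumulative distribution and its comparison with the standard may be sketched as follows. The function names and the marks for the two terms are hypothetical; the standard percents are the 7, 24, 38, 24, 7 distribution cited earlier, and a school would substitute whatever distribution it has adopted.

```python
from collections import Counter

# Percents of A, B, C, D, E in one standard distribution cited in the text.
STANDARD = {"A": 7, "B": 24, "C": 38, "D": 24, "E": 7}


def accumulate(terms):
    """Combine the marks given in several terms into one accumulative count."""
    total = Counter()
    for marks in terms:
        total.update(marks)
    return total


def departures(total, standard=STANDARD):
    """Percent given of each mark minus the standard percent.

    A positive figure means the mark was given more often than the standard.
    """
    n = sum(total.values())
    return {m: round(100 * total[m] / n - pct, 1) for m, pct in standard.items()}


# Hypothetical marks given by one teacher in two terms of first-year algebra.
total = accumulate(["AAABBBCCCD", "AABBBBCCCD"])
report = departures(total)
```

A large positive figure for A and B with negative figures for the lower marks would suggest grading that is too high. As the text cautions, such a report is a cause for investigation and not in itself a proof of error, particularly while the total number of grades is still small.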


2. Decreasing the variable errors in examination scores. 
The reduction of the magnitude of variable errors of measurement 
in examination scores is to be secured mainly through the adop- 
tion of rules which will bring about greater uniformity in preparing 
and administering examinations. These rules should include 
specifications in regard to the effect of poor writing, poor spelling, 
and poor English upon a student’s grade, and should be in agree- 
ment in regard to giving credit for correct principle and partial 
credit for exercises partly right or partly completed. The rules 
may properly include also specifications relating to the number and 
types of questions to be asked and the form in which they are to be 
presented to the students. For guidance in marking papers a 
teacher should write out, at least in abbreviated form, the cor- 
rect answers to the questions. The accuracy of examination scores 
will be increased also by making the examinations more uniform 
with respect to content. 


⁴For recommended rules covering these and other points see Chapter VII.

It has been proposed that the use of types of questions which
call for answers that may be objectively classified as either “right” 
or “wrong,” would facilitate uniformity in marking the papers. 
This means of reducing the variable errors of measurement will 
be considered under the head of “simplifying the administration 
of written examinations.” 


3. Securing agreement of the content of examinations with 
recognized educational objectives. The intrinsic function of an 
examination is to measure certain achievements. In general the 
achievements for which we desire to secure measurements are those 
included in the recognized educational objectives. Hence, the 
questions should be in agreement with the objectives. Furthermore,
it is impossible to cover all details in a given subject-matter field.
The questions should relate to the most significant facts, princi- 
ples, etc. of the course. Catch questions and those calling for un- 
important details have no place in an examination. For example, 
an examination in spelling should not include unusual or obsolete 
words, an examination in history should not call for obscure dates 
or other trivial facts. 

In securing agreement the teacher should make use of such
lists of minimum essentials as are available. For example, in
spelling a teacher may very properly select the test words from
Ayres’ list of the one thousand most frequently used words or
from some other carefully prepared minimum essential list. In
geography a teacher will find the Hahn-Lackey Geography Scale
a helpful source of questions. In other subjects the teacher will
not have access to lists of minimum essentials as complete as in
these two subjects, but he should become familiar with curriculum
studies and other investigations⁵ relating to educational objectives.
In some subjects there are valuable committee reports which give
the consensus of opinion concerning the relative importance of the
numerous distributions.

⁵The following list is suggestive of studies relating to educational objectives:

Yearbooks of the National Society for the Study of Education. Bloomington,
Illinois: Public School Publishing Company.

Part I of 14th: reading, writing, spelling, language and grammar, arithmetic, history,
literature, geography.

Part I of 16th: reading, writing, spelling, arithmetic, history, physical education.

Part I of 17th: arithmetic, geography, reading, English, civics, history.

Part II of 17th: history, civics, economics, sociology, geography.

Part I of 19th: on new materials of instruction, reading, history, geography,
mathematics, nature study, civics.

Part I of 20th: on materials of instruction, all subjects in elementary schools.

Part II of 22nd: the social studies in the elementary and secondary school.

“Arithmetic, course of study for the elementary schools, including the kindergarten
and the first six grades.” Course of Study Monographs, Elementary Schools,
No. 1. Berkeley, California: Public Schools, 1921. 86p. (Concluded on p. 56.)

The teacher must assume most of the responsibility for se- 
curing the agreement between the content of the examination and 
educational objectives. In many of the high-school subjects he 
can obtain little assistance from such sources as just indicated. 
However, if this purpose is kept in mind and if he is really famil- 
iar with the subject which he is teaching, gross inconsistencies 
with recognized educational objectives will be avoided. 


4. Simplifying the administration of written examinations. 
The administration of written examinations, particularly the 
marking of the papers, can be greatly simplified by the use of 
certain types of exercises. For example, in the true-false type of 
exercise the pupil merely indicates whether the statement is true 
or false. Instead of asking the question, “Why did the Puritans
come to America in the seventeenth century?” we may ask
whether the following statement is true or false: “The Puritans
came to America in the seventeenth century seeking wealth.”
The pupil may give his answer to this exercise by writing a plus
sign after the statement if he considers it true and a minus sign if
he considers it false. In case the statement is dictated to him
he may write after the number of the exercise the word “true”
or “false” or the appropriate sign. The answering of such exercises
requires very little of the pupil’s time and the scoring is
exceedingly simple. Questions which can be answered merely by
“yes” or “no” also simplify the administration of examinations.
Similar results can be secured with recognition exercises such as 
have been used in a number of standardized silent reading tests. 
The following is an exercise of this type. 


Ayres, L. P. “A measuring scale for ability in spelling.” N. Y.: Division of Education,
Russell Sage Foundation, 1915. 58p.

Ayres, L. P. “Measuring scale for handwriting.” N. Y.: Division of Education,
Russell Sage Foundation, 1920. (Folder, chart.)

Bagley, W. C. and Rugg, H. O. “The content of American history as taught in the
seventh and eighth grades.” University of Illinois Bulletin, Vol. 13, No. 51. Urbana:
University of Illinois.

Charters, W. W. Curriculum Construction. N. Y.: Macmillan Co., 1923. 352p.

Charters, W. W. and Miller, Edith. “A course in grammar.” University of Missouri
Bulletin, Vol. I, Education Series 9. Columbia: University of Missouri, 1915.

Hahn, H. H. Hahn-Lackey Geography Scale. Wayne, Nebraska: H. H. Hahn, State
Normal School.

Hahn, H. H. Scale for Measuring Ability of Children in History. Wayne, Nebraska:
H. H. Hahn, State Normal School.

Moore, E. C. Minimum Course of Study. N. Y.: Macmillan, 1923. 402p.

“The first president of the United States was: Christopher
Columbus, Benjamin Franklin, George Washington,
Thomas Jefferson.”


In answering this exercise the pupil is asked to underline or mark 
in some other way the name required to make a true sentence. 
Completion exercises in which pupils are asked to supply words 
which have been omitted furnish still another means of simplifi- 
cation. 


Directions for constructing a true-false examination.⁶ 1. In
constructing true-false exercises, a list of statements covering in 
some detail the portion of the subject on which the pupils are to be 
examined should be prepared. Some of the statements can then 
easily be changed so that they are false. The untruth of a state- 
ment should not be too obvious or it will be worthless for testing. 
Also statements should be selected which require an acquaintance 
with the subject in order to determine their truth or falsity. 

2. In a true-false examination the number of true statements 
should approximate the number of false statements, and the 
arrangement should be such that there is no regular sequence be- 
tween true statements and false statements. 

3. Since the pupil can give his responses very quickly, the 
examination should consist of not less than fifty statements. A 
true-false examination of one hundred statements can be given in 
the time usually devoted to an ordinary examination. 

4. The examination should be mimeographed or printed so
that each pupil will have a copy. He may give his answers in the 
margins of the sheets, or, if it is desired to use the same set of 
papers with another group of pupils, he may be given a sheet of 
paper on which there are numbered blanks. The pupils will then 
be asked to record in the blanks their answers to the corresponding 
exercises. A less desirable plan, which may be followed when it is 
not possible to secure mimeographed copies of the examination, is
to read the statements to the pupils and have them record their 
answers in numbered blanks. The disadvantage of this plan is 
that the pupils do not have a satisfactory opportunity to study the 
statements. Also the class may give some indication of the answer 
if a statement appeals to them as being ridiculous. 

5. The pupils should be given specific directions in regard to
answering exercises about which they are uncertain. One writer
has suggested that the pupils be instructed to guess concerning the
truth or falsity of the statement. Another writer who has used
this type of examination instructed the pupils as follows: “First,
go through the list quickly and mark all that you know for certain,
then go back and study out the harder ones. Do not guess; the
chances are against you on guessing. Don’t endanger your score
by gambling on those questions about which you know nothing.”
This second procedure is probably the better.

⁶For an example of a true-false examination, see Appendix p. 69.
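
Directions 2 and 3 above may be sketched in code: the statements are checked for number and for balance between true and false, and are then placed in an order having no regular sequence. The function name and the placeholder statements are hypothetical. The minimum of fifty statements comes from direction 3; the tolerance of 10 percent used to judge whether the numbers of true and false statements are "approximately" equal is an assumption made for this sketch.

```python
import random


def build_true_false(true_items, false_items, min_items=50, seed=None):
    """Return a shuffled list of (statement, is_true) pairs.

    min_items reflects direction 3 (not less than fifty statements);
    the 10-percent balance tolerance is an assumption made for this sketch.
    """
    total = len(true_items) + len(false_items)
    if total < min_items:
        raise ValueError(f"only {total} statements; at least {min_items} are wanted")
    if abs(len(true_items) - len(false_items)) > 0.10 * total:
        raise ValueError("true and false statements should be about equal in number")
    items = [(s, True) for s in true_items] + [(s, False) for s in false_items]
    random.Random(seed).shuffle(items)    # no regular sequence of true and false
    return items


# Placeholder statements standing in for the teacher's own list.
true_items = [f"true statement {i}" for i in range(1, 26)]
false_items = [f"false statement {i}" for i in range(1, 26)]
exam = build_true_false(true_items, false_items, seed=1923)
```

The second member of each pair serves as the scoring key and would of course be withheld from the copy given to the pupils.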


The scoring of a true-false examination. Since only two re- 
sponses are possible, it is obvious that a pupil may give a correct 
response as the result of chance. In order to take this possibility 
into account, a pupil’s score on an examination of this type is the 
number of exercises answered correctly minus the number answer- 
ed incorrectly. Exercises not attempted are not counted. 
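
The rule of scoring may be sketched as follows. A pupil who merely guesses may expect to be right on about half of the exercises he attempts and wrong on the other half, so that under this rule his guesses add nothing to his score on the average. The function name and the sample key and responses are hypothetical; an exercise not attempted is represented by None.

```python
def true_false_score(key, responses):
    """Number answered correctly minus number answered incorrectly.

    Exercises not attempted (None) are not counted either way.
    """
    right = sum(1 for k, r in zip(key, responses) if r is not None and r == k)
    wrong = sum(1 for k, r in zip(key, responses) if r is not None and r != k)
    return right - wrong


key = [True, False, True, True, False]
responses = [True, False, False, None, False]   # one wrong, one not attempted
score = true_false_score(key, responses)        # 3 right - 1 wrong = 2
```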


Directions for constructing a recognition examination.⁷ In
constructing this type of examination none of the proposed an- 
swers should be too obviously incorrect. An exercise can yield an 
indication of a pupil’s achievement only when he is forced to use 
judgment in determining which of the proposed answers is suit- 
able. For example, the illustrative exercise given would be practi- 
cally worthless for testing purposes if all the names, except that of 
George Washington, were of persons living today or of persons 
having no connection with our national life. In applying this type of
exercise to the field of arithmetic the proposed answers should include 
erroneous answers which pupils are inclined to give: if the exercise 
called for the quotient of two fractions, one of the proposed an- 
swers should be the product of the fractions and another their sum, 
and perhaps another should be the fraction obtained by taking 
the sum of the numerators as a new numerator and the sum of the 
denominators for a new denominator. When the correct answer is
included in a group of such answers as these, the pupil who does 
not know how to find the quotient of such fractions will be unable 
to determine the correct answer except as a matter of chance. 
On the other hand, if all of the answers except the correct one were 
integers or were so large that they were obviously incorrect, a 
bright pupil who knew nothing about division of fractions would 
be able to select the correct answer. The correct answer should 
not always be found in the same position; sometimes it should be
first, sometimes last, and sometimes in an intermediate position.
As in the case of the true-false examination, a recognition examination
should consist of a large number of exercises.

⁷For an example of a recognition examination see Appendix p. 75.
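
The choice of proposed answers for the arithmetic exercise described above may be sketched as follows. For an exercise calling for the quotient of two fractions the sketch computes the correct answer together with the three erroneous answers named in the text: the product, the sum, and the fraction formed by adding numerators and adding denominators. The function name and the two fractions are hypothetical.

```python
from fractions import Fraction


def division_choices(a, b):
    """Return the correct quotient a / b and three plausible wrong answers."""
    correct = a / b
    wrong = [
        a * b,                                      # product of the fractions
        a + b,                                      # sum of the fractions
        Fraction(a.numerator + b.numerator,         # numerators added and
                 a.denominator + b.denominator),    # denominators added
    ]
    return correct, wrong


# Hypothetical exercise: 2/3 divided by 4/5.
correct, wrong = division_choices(Fraction(2, 3), Fraction(4, 5))
```

Fraction reduces every result to lowest terms, so the last wrong answer appears as 3/4 although a pupil making that error would write 6/8; the teacher may print either form. Before an exercise is used it should be verified that none of the wrong answers happens to equal the correct one.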

Examinations of this type should be mimeographed or printed 
and each pupil should have a copy. Definite instructions concern- 
ing methods of work should be given. It is probably best to in- 
struct the pupil to work through the test rapidly, answering those 
exercises about which he is certain. He should then go back over 
the list and try the more difficult ones. Not fewer than four pro- 
posed answers should be included in each statement and the 
pupils may be instructed to guess if they do not know, since the 
chance of success by guessing is slight. The pupil’s score on an 
examination of this type may be taken as the number of exercises 
done correctly. 

A somewhat unusual but interesting type of recognition exercise
is that described as a “matching contest.” In this a pupil
is given two lists of statements, the first numbered 1, 2, 3, 4, 5,
etc., the second marked A, B, C, D, E, etc. In the second list,
there is a statement which corresponds in meaning to a statement
in the first list and the pupil is to pair these statements, marking
by the number of the first list the letter of the corresponding statement
of the second. For example, in the exercises given below,
by the date marked (5), 1898, we place the letter B to indicate the
event for which that date is significant. It is difficult to construct 
such examinations so that they will require reasoning on the part 
of the student. Their most important use is in the elementary 
school for rapid drill in certain phases of some subjects, such as 
definitions in geography and grammar, etc. The following exer- 
cises, selected from the Spokane United States History Test, 
illustrate the use of such an examination in linking a certain date 
or person with the corresponding event. 


 1. 1846        A. Lincoln’s Emancipation Proclamation
 2. 1865        B. Spanish-American War
 3. 1863        C. Beginning of World War
 4. 1917        D. Declaration of Independence
 5. 1898        E. United States entered World War
 6. 1789        F. Election of Washington as President
 7. 1792        G. War with Mexico began
 8. 1776        H. Invention of the cotton gin
 9. 1861        I. Lee’s surrender at Appomattox
10. 1914        J. Beginning of Civil War

 1. Foch        A. Destroyed Spanish fleet in Manila Bay
 2. Lincoln     B. Invented the telephone
 3. Fulton      C. Leading Confederate General
 4. Dewey       D. Wrote the Declaration of Independence
 5. Pershing    E. Invented the steamboat
 6. Bell        F. Commanded allied armies in the World War
 7. Edison      G. Was President during the Civil War
 8. Jefferson   H. Commanded American Forces in the World War
 9. Lee         I. Was Revolutionary patriot, author, and inventor
10. Franklin    J. America’s most famous inventor
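
The scoring of such a matching exercise may be sketched as follows, taking the score as the number of pairs matched correctly in accordance with the rule given above for recognition exercises. The key shown for the list of dates has been worked out here from the well-known dates of the events and is not reproduced from the published key of the Spokane test; only the pairing of item 5 with B is given in the text. The function name and the sample answers are hypothetical.

```python
# Key for the list of dates: item number -> letter of the matching event.
# Worked out for this sketch; only 5 -> "B" is stated in the text.
DATE_KEY = {1: "G", 2: "I", 3: "A", 4: "E", 5: "B",
            6: "F", 7: "H", 8: "D", 9: "J", 10: "C"}


def matching_score(key, answers):
    """Count the pairs matched correctly; items left blank earn nothing."""
    return sum(1 for item, letter in key.items() if answers.get(item) == letter)


# Hypothetical pupil: three pairs right, two wrong, five left blank.
answers = {1: "G", 2: "J", 3: "A", 5: "B", 9: "I"}
score = matching_score(DATE_KEY, answers)
```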


Directions for constructing completion exercises.⁸ A completion
exercise should be constructed so that no suggestion will be
given of the correct words to be written in the blanks. Furthermore,
the facts to be supplied should be important. The best
plan is to prepare a list of important statements and principles 
covering the portion of the subject over which the pupils are to be 
examined and then from these statements to strike out a certain 
significant word or phrase. In every case, if it is possible, the 
words omitted should be such that only one answer will be correct. 
Since little writing is required of the pupils they may be asked to 
fill in as many as one hundred blanks. 

The scoring of completion exercises is not as highly objective 
as in the two types mentioned above. Pupils will tend to write a 
variety of words in the blanks. Different words may have almost 
the same meaning, and frequently the scorer will be compelled to 
determine whether the meaning of some word is sufficiently near 
that of the correct answer to justify giving the pupil credit for 
having answered the exercises correctly. However, by a careful 
selection of statements and of the omitted words, this subjectivity 
may be greatly minimized. For example, in the sentence, “The 
first Continental Congress was held in . . . .,” only one 
possible word can be correct. In using completion exercises it is 
necessary to provide each pupil with a mimeographed or printed 
copy of the examination. The pupil’s score is the number of 
blanks filled in correctly. 
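
The scoring of completion exercises may be sketched as follows. Because pupils will write a variety of words, the scorer decides beforehand which responses are to be accepted for each blank, and the score is the number of blanks filled in with an accepted response. The function name, the second blank, and the sample responses are hypothetical; the first blank is the Continental Congress sentence quoted above, for which Philadelphia is the answer.

```python
def completion_score(accepted, responses):
    """Count the blanks whose response is among the answers accepted for it.

    Differences of capitalization and spacing are disregarded.
    """
    def tidy(text):
        return " ".join(text.lower().split())

    score = 0
    for blank, answers in accepted.items():
        given = tidy(responses.get(blank, ""))
        if given and given in {tidy(a) for a in answers}:
            score += 1
    return score


# Blank 1: "The first Continental Congress was held in ....."
# Blank 2 (hypothetical): "Fulton invented the ....."
accepted = {1: {"Philadelphia"}, 2: {"steamboat", "steam boat"}}
score = completion_score(accepted, {1: " philadelphia ", 2: "Steam  Boat"})
```

The list of accepted answers does not remove the scorer's judgment; it only requires that the judgment be exercised once for each blank and then applied alike to every paper.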

⁸For an example of a completion examination see Appendix p. 73.

Advantages of the “new examination.” Examinations consisting
of exercises of the types described above have certain obvious
advantages. There will be a large saving of time for both
teacher and pupil. The pupil is called upon to do little or no
writing in giving his answers and therefore is able to respond to a
large number of exercises. The teacher in scoring will have little 
or no occasion to use judgment as he will need only to note the 
brief responses given by the pupils. Thus the labor of scoring will 
be greatly reduced and, more important, the scoring will be much 
more highly objective than that in the marking of examination 
papers of the usual type. The saving of time in the giving and 
scoring of the “new examination” will more than offset any additional
time that may be expended in its construction. Another
advantage is that the new examination can be made more comprehensive.
Examinations as a rule consist of ten questions. Some
are limited to a smaller number. Consequently the scope of examinations
of the traditional type is necessarily narrow. “New
examinations” of the true-false type should consist of not less than 
fifty exercises and may have as many as one hundred. Other types 
of the “new examination” should be of a corresponding length. 
Hence a “new examination” will usually be more comprehensive 
than a traditional examination. 

Limitations of the “new examination.” There are certain
limitations of the new examination which should be noted along
with its advantages. It can not be used in mathematics except to a
limited extent. It can not be used at all in English Composition. 
In other subjects there are many phases of achievement which are 
not measured directly by examinations made up of exercises of the 
types described above. Hence, altho the “new examination” is
more comprehensive with reference to information, and does meas- 
ure certain types of achievements, it is likely that pupils would 
miss much valuable experience and training if they were not at 
times asked to “compare,” “explain,” “discuss,” “define,” or
“tell why.” They should also be asked to summarize material 
presented on a topic or to apply certain principles. The following 
questions taken from Hahn’s Scale for Measuring Ability of 
Children in History appear to require mental processes distinctly 
different from those called for by the “new examination.”

1. “State points of similarity between the position of the United 
States in 1812 and their position in 1912.” 

2. “Arrange the following events in order of cause and effect: 
Force Bill, Carpet Baggers, Fifteenth Amendment, Negro Rule in 
Some of the Southern States, Ku Klux Klan.” 

3. “Name the presidents of the United States since 1892.”

An intelligent attitude toward the “new examination.” The
simple administration of the new examination and other attractive
features should not blind one to the limitations just mentioned.
As indicated in Chapter II written examinations do more
than merely secure measures of achievement. If they consist of the 
right kind of exercises they afford significant educational oppor- 
tunities. The educational opportunities of the “new examination” 
are necessarily restricted, and it would be unfortunate if it entirely 
replaced examinations of the traditional type. The new examina- 
tion, however, has a place. It may be used occasionally in most 
school subjects. It is useful when a teacher wishes to test the 
acquaintance of a class with a wide range of facts. It has little 
diagnostic value and examinations of the traditional type should 
be used when information is desired concerning the weaknesses 
of different members of a class. For this reason the “new examination”
is more appropriate for use at the end of a term than for
tests during the term which have as their purpose both measurement
and diagnosis.

CHAPTER VII


RULES FOR THE PREPARATION AND ADMINISTRATION 
OF WRITTEN EXAMINATIONS. 


Below, a group of suggested rules governing the preparation 
and administration of written examinations are given. These 
represent the opinion of the writers which is based upon a careful 
study of the problems involved, as well as upon several years of 
experience in the measurement of school achievement. 

1. Final examinations should be required. In school subjects 
such as shop work, in which the performances secured from pupils 
are highly objective, the waiving of this requirement may be justi- 
fied. When final examinations are given no student should be 
excused from them because of high daily grades, deportment or 
attendance. (See p. 16) 

2. The content of final examinations should agree as closely 
as possible with recognized educational objectives. In fields where 
minimum essentials have been determined they should be used as 
a basis in formulating questions. (See p. 55) 

3. The questions should be definite and stated so that all 
pupils will interpret them alike. Questions relating to items of 
minor importance should occupy a minor place in examinations. 
Questions relating to points which have not received attention in 
the course should be omitted.¹ (See also rule 11.)

4. When the necessary equipment is available the questions 
should be mimeographed or typewritten so that each student will 
have a copy on his desk. In case they are written on the board the 
teacher should make certain that all pupils are able to read them 
correctly. It is well in either case to read the questions aloud to 
the class. 

5. The examination should be sufficiently difficult so that
few pupils will make perfect scores. (This rule should not apply
when the purpose of the examination is to determine which students
have attained a given standing which includes perfection of
performance.) (See p. 12)

¹Frequently in an examination the difficulty of the question is due to the lack of
emphasis placed upon it throughout the term because it deals with a relatively unimportant
topic. Other topics, difficult in themselves, but emphasized because of
their importance, furnish the easier questions.

6. Usually the examination should be long enough so that 
every member of the class will be kept busy for the entire period. 
It is better to make the examination too long than to have it too 
short. In this way it becomes possible also to take into account 
the student’s rate of work in determining his grade. Appropriate 
adjustments can be made in interpreting “scores” into school 
“marks.” (See pp. 13 and 22)

7. In questions asking for a discussion or explanation indicate
the completeness of the discussion or the degree of elaborateness
expected in the answer. (See p. 24)

8. Time may be economized for both students and teacher by 
using some form of the “new examination.” This type of measuring
instrument, however, possesses certain limitations which
should be kept in mind. The exclusive use of it would be unwise. 
(See p. 61) 

9. Unless the students have a definite understanding of the 
methods of work which are to be followed, the teacher should give 
them explicit directions concerning such matters as the order in 
which the questions are to be answered, the desired arrangement 
of the work, and any other items in which there is an opportunity 
for pupils to adopt different procedures. (In the case of most 
standardized educational tests, the directions to students are very 
detailed and explicit.) (See p. 24) 

10. Approximately ninety minutes should be allowed for a 
final examination in most high-school subjects. The time which 
teachers should devote to the preparation of the questions will de- 
pend upon their experience and upon their practise during the 
semester. It is recommended that a teacher make a record 
throughout the term of questions which in his judgment are 
suitable for a final examination. From two to three hours should 
be sufficient for marking a set of twenty-five examination papers. 
If a teacher finds that a longer time is required, he should en- 
deavor to modify his procedure so that this work can be done 
more quickly. (See p. 19) 

11. Altho any weighting of questions by a teacher will be 
subjective, it is probably desirable to weight the questions, particularly
in cases of extreme differences in value. The weighting,
however, should be upon the basis of social importance rather than
upon mere difficulty. (See p. 22) 

12. It is advisable for the teacher to write out at least in an 
abbreviated form the answers to examination questions before he 
begins marking the papers. In mathematics and other subjects 
where a definite answer is required and only one can be accepted 
as correct, the need for this rule is not as great as in such subjects 
as geography, history, literature and certain phases of science. 
However, a list of correct answers will usually mean a saving of 
time. (See p. 23) 

13. Except in courses in English a pupil’s grade should not be 
intentionally lowered for errors in spelling or for poor handwriting. 
As a rule the grade should not be lowered because of poor English, 
unless the quality of the English is evidence of unsatisfactory 
reasoning. Rules covering these points, as well as others con- 
cerning which teachers might differ, should be formulated by the 
principal in conference with his teachers or at least a committee 
of them. The rules thus formulated should be carefully followed 
by all of the teachers. (See p. 20) 

14. In marking the papers more accurate results will in gen- 
eral be secured if the answers to one question are marked on all 
the papers before those for another question are taken up. (See 
p. 23) When it is desired to mark all of the questions on one paper 
before taking up another, the “sorting method” should be used.
According to this procedure the papers as they are read are sorted 
into piles, the best ones being placed in the first pile, the next best 
in second pile, etc. Five distributions will, in most cases, prove 
sufficient. After all the papers have been distributed they should 
be reread, one pile at a time, and compared with each other. If 
these papers do not possess approximately the same value, changes 
in the sorting may be made. Grades may then be assigned to the 
papers in the different piles. 

15. The distinction between “scores” and “grades” should be
kept in mind. (See p. 11) The papers should be marked first 
in terms of scores. In doing this an appropriate number of points 
should be determined for each question. It is not necessary that 
the total of these points be 100. (See p. 52) 

16. The point scores assigned to the examination papers 
should be translated into school marks. In doing this the use of a 
standard distribution will be found helpful, and will operate also 
to decrease the magnitude of the constant errors. (See p. 50) 
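
One way of translating point scores into school marks by means of a standard distribution can be sketched as follows. The five marks and the shares of 10, 20, 40, 20, and 10 percent are assumptions chosen only for illustration; rule 16 does not prescribe them, and `marks_from_scores` is a hypothetical name. Each paper is marked according to the proportion of the class scoring above it, so papers with equal point scores always receive the same mark.

```python
# Assumed standard distribution: (mark, percent of the class), best mark first.
SHARES = (("A", 10), ("B", 20), ("C", 40), ("D", 20), ("E", 10))


def marks_from_scores(scores, shares=SHARES):
    """Translate point scores into marks that roughly follow `shares`."""
    n = len(scores)
    marks = []
    for s in scores:
        above = sum(1 for t in scores if t > s)  # papers with a higher score
        cumulative = 0
        for mark, percent in shares:
            cumulative += percent
            # Integer comparison of above/n with cumulative/100.
            if above * 100 < cumulative * n:
                marks.append(mark)
                break
    return marks


scores = [42, 35, 35, 28, 27, 25, 22, 20, 15, 9]  # hypothetical class of ten
print(list(zip(scores, marks_from_scores(scores))))
```

Because the marks depend only on the rank order of the papers, the total number of points on the examination need not be 100, which agrees with rule 15.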

APPENDIX
(Questionnaire Sent to High-School Teachers) 


QUESTIONNAIRE RELATING TO THE USE OF WRITTEN EXAMINA- 
TIONS IN HIGH SCHOOLS 


The major subject which I am teaching is ........................


The following questions are to be answered only with reference to the
major subject you are teaching.


1. Approximately how much time do you use in preparing questions
for a final examination which the students are allowed a total of 90 minutes
to answer? ........................


(“Final examination” as used in this questionnaire means an examination
which is given at the end of a semester and which is based on the work of
the entire semester.)


2. In preparing a set of examination questions do you usually attempt
to arrange these questions in order of ascending difficulty? ........ Yes* No*


3. Do you prepare in written form carefully worded directions to the
students regarding the procedure they are to follow in answering the
questions? (These directions might include such points as, order in which
questions are to be answered, length of answers, arrangement of work, etc.) ........ Yes No


4. Which of the following methods do you use in presenting questions
to the students?

(a) Writing the questions on the board ........ Yes No

(b) Furnishing the pupils with a mimeographed, carbon or
printed copy of the questions ........ Yes No

(c) Dictating the questions to the pupils ........ Yes No


5. Is it your custom to make the examination long enough so that
practically none of the students will answer all of the questions in the
time allowed? ........ Yes No


6. Is it your custom to make the examination short enough so that
practically all of the students will answer all of the questions in the time
allowed? ........ Yes No


7. If you have given an affirmative answer to Question 6, do you note
the time each student spends in writing his answers? ........ Yes No


8. What proportion of the final mark for the semester is based upon
the final written examination grade? ........ %


9. In assigning grades to examination papers do you attempt to have
their distribution conform to any standard form such as the normal
distribution? ........ Yes No


10. Do you usually grade all the answers on one paper before taking
up those on another paper? ........ Yes No


11. Do you usually grade the answers to one question on all of the
papers before taking up the answers to a second question? ........ Yes No


12. Instead of marking the answers to each question separately do
you attempt to estimate the value of the paper as a whole? ........ Yes No


13. Before starting to grade a set of examination papers do you write
out the answers which you consider correct? ........ Yes No


14. When you consider the questions of an examination to be unequal
in difficulty is it your practise to give more credit for a correct answer
to a difficult question than for a correct answer to an easy question? ........ Yes No


15. Approximately how much time do you use in marking the papers
of a final examination which the students are allowed a total of 90 minutes
for answering? Estimate as accurately as possible. Base this answer
on a class of 25 students. ........ Min.


16. In marking examination papers do you intentionally lower a student’s
mark in the case of

(a) Poor writing ........ Yes No

(b) Poor spelling ........ Yes No

(c) Poor English ........ Yes No


17. In the case of questions which are essentially mathematical in
character do you give credit to the student for using the correct principle
even though the final answer be wrong? ........ Yes No


*Underline the answer (Yes-No) which you desire to make.


NOTE: It is desired that only teachers of mathematics, physics, and chemistry answer No. 17.


NOTE TO TEACHER: When you have answered the above questions please return this
questionnaire to your principal.


(Questionnaire Sent to High-School Principals)


Principal ............ High School ............ City ............


1. Do you require your teachers to give final examinations at the
end of each semester? ........ Yes* No*
(“Final examination” as used in this questionnaire means an examination
which is given at the end of a semester and which is based on
the work of the entire semester.)
If a negative answer is given to the first question, no answer is expected
for the remaining questions.


2. a. Is it the practise in your school to exempt certain students
from final examinations? ........ Yes No
b. If so, what are the conditions (requirements) upon which you
base exemption?

1. Deportment ........
2. Scholarship ........
3. Other requirements ........


3. How many minutes do you allow for final written examinations? ........


4. Because final examinations have proven unreliable some educators
urge that students be given more than one comprehensive examination
in each subject, and that these examinations be given on different days.
Do you require more than one such final written examination in each
subject? ........ Yes No


5. a. In marking examination papers is it the practise in your school
for the teachers to subtract from a pupil’s grade for
1. poor writing ........
2. poor spelling ........
3. poor English ........
b. Are your teachers accustomed to giving more credit for correct
answers to difficult questions than for correct answers to easy
questions? ........
c. Is it the practise of your teachers when computing a semester
mark to add or deduct credit in proportion to the time used
by a pupil in answering the final written examination questions? ........ Yes No


6. a. Have you advised your teachers as to what proportion of the
final mark for the semester should be based on the final written
examination? ........ Yes No
b. Have you made a definite requirement in this respect? ........ Yes No
c. If so, what proportion of the final mark for the semester do you
require to be based upon the final written examination mark? ........


7. Additional information pertaining to this topic is called for in
a second questionnaire which is to be filled out by high school teachers.
Would you be willing to distribute this questionnaire among your teachers?
If so, will you kindly indicate the number of teachers in your high
school?


*Underline the answer (Yes-No) which you desire to make.


EXAMPLES OF “NEW EXAMINATIONS”


(The following “new examinations” are given for purposes of illustration. They may in- 
clude several exercises which will prove unsuitable when given to pupils.) 


TRUE-FALSE EXAMINATION IN PHYSIOLOGY 
Prepared by 
Bureau of Educational Research 
University of Illinois 


Name ........................ Boy or Girl ............
Age last birthday ........ Next birthday will be ........ 19....
Grade ........ Date ........ City ........ State ........
School ........................ Teacher ........................


Below you will find a number of statements. Some of these statements are true, 
others are not true. Read each statement carefully, then if it is true mark a plus (+) 
in the column to the right of the sentence. If the statement is not true mark a minus 
(—) in the column to the right. 


EXAMPLES

Answers to be written here

Read the statement below very carefully.


1. Fats will form a lasting mixture with water. 


This is not a true statement so you will place a minus (—) sign in the 
column. Now read the second sentence. 


2. The layer of fat just beneath the skin is more than one- 
tenth of an inch thick. 


This is a true statement so you will mark a plus (+) sign in the column. 
Now read the third sentence. 


3. The union of oxygen with any substance produces heat. 


This is a true statement so you will mark a plus (+) sign in the column. 
Now read the fourth sentence. 


4. Nitrogen constitutes only one-fifth of the volume of the air. 


This is a false statement so you will mark a minus (—) sign in the 
column.


PHYSIOLOGY

Answers to be written here


1. The kidneys vary 12 inches to 16 inches in length.

2. The external poisoning of the skin by poison ivy or sumac never
results seriously.

3. A person having a good mind must necessarily have a large brain.

4. Color blindness is more prominent in men than women.

5. Plenty of fluids should be drunk at the time of eating solid food.

6. Bones are composed of animal and mineral matter.

7. The nails are hardened outer skin or epidermis.

8. The use of alcohol increases the tendency to commit crime.

9. A full grown person contains about six quarts of blood.

10. The brain is almost perfectly spherical in shape.

11. The kidneys are almost perfectly round.

12. All animals are made up of cells.

13. Substances, like glass, which permit rays of light to pass through
them readily are said to be opaque.

14. The sense organs of smell are located in the lining of the cavity of
the nose.

15. The skin is composed of two layers of tissue.

16. The end organs for taste occur in the mucous membrane of the
tongue.

17. To extinguish the burning clothing of a person, it is necessary to
wrap him in something to exclude the air.

18. All of the interior of the spinal cord is filled with gray matter containing
nerve cells.

19. The great difference in the complexion of persons is due largely
to the pigment lying in the epidermis.

20. Cancer is caused by germs growing in the tissue.

21. Diphtheria can be controlled by the use of Diphtheria antitoxin.

22. The brain is separated into two parts or hemispheres by a great
longitudinal fissure.

23. When oxygen is separated from other substances the process is
called oxidation.

24. Infectious diseases are due to changed methods of work and
growth on the part of cells in certain regions of the body.

25. The use of alcoholic beverages builds up the body and makes the
muscles stronger.

26. The great majority of grown people have been infected with
tuberculosis germs.

27. The sense organs are the terminations of the sensory nerves serving
to carry impressions to the spinal cord or brain.

28. Farsightedness is often caused by a blow on the eye.

29. An antiseptic is a substance which merely restrains the germs from
growing.

30. The brain is in communication with the rest of the body by means
of nerves.

31. The cerebrum is the path of communication between the nerves
supplying the arms, trunk, legs, and brain.

32. The chief function of muscles is to hold up the body.

33. All milk contains bacteria.

34. The alcohol used in drinks is produced by the growth of yeast in a
liquid containing sugar.

35. Our blood contains white corpuscles which destroy disease germs.

36. More people die daily from diphtheria than from tuberculosis.

37. The use of tobacco increases the strength of the muscles.

38. The use of tobacco makes the nerve cells function more keenly.

39. The chewing of dry bread aids the digestion as much as the use of
gum.

40. Air is composed chiefly of two gases, oxygen and nitrogen.


41. The first step in treating a person who has been poisoned is to 
give an emetic. 


42. Light is produced by waves of a substance called ether. 


43. Non-infectious diseases are caused by small plants or animals 
called parasites feeding upon the human body. 


44. Alcoholic beverages have great value in curing disease.


45. A drink of alcoholic beverage in the winter time causes a man’s 
body to become warm. 


46. Each portion of the brain has its own definite work to perform. 


47. Fainting is caused by an over-sufficient supply of blood being
sent to the brain. 


48. The spinal cord may act independently of the brain and produce 
many of the muscular movements necessary in routine work. 


49. The germs of typhoid fever usually gain access to the body by 
being breathed in with air. 


50. Narcotics are substances which cause any organs of the body to 
act more vigorously than is their custom. 


Directions to teachers: After the four examples have been studied by the 
pupils, read the following directions to them: “On the next page you will find a num- 
ber of statements similar to the ones you have just read. You are to place a plus sign 
or a minus sign in the column to the right of each statement just as has been done on 
the first page. Mark all of the statements that you are sure you can answer correctly. 
If you find a statement that you are not sure you can answer correctly, study it care- 
fully and then mark the answer you think will be correct. If you find a statement you 
know nothing about, make no attempt to mark it, as guessing counts heavily against 
you. You will have 25 minutes for the test. I shall expect you to stop promptly and 
turn your folders face down on the desk when I tell you to do so. Ready-Go.” 

In computing the score of each pupil on a test subtract the total number of wrong 
answers from the total number of right answers. Such scores are called “point-scores.”
In interpreting them it is advisable to form a distribution which will show how many 
pupils received each score. From the distribution it is possible to work out a basis for 
translating the point scores into the usual kind of school marks. 
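
The computation just described can be sketched as follows. This is only an illustration under stated assumptions: the function names `point_score` and `distribution` and the four pupils' responses are hypothetical, and the key holds the answers to the four sample statements given on the first page of the test. Statements a pupil leaves unmarked are assumed to count neither as right nor as wrong.

```python
from collections import Counter

# Answers to the four sample statements: minus, plus, plus, minus.
KEY = ["-", "+", "+", "-"]


def point_score(responses, key=KEY):
    """Number right minus number wrong; None marks an omitted statement."""
    right = sum(1 for r, k in zip(responses, key) if r is not None and r == k)
    wrong = sum(1 for r, k in zip(responses, key) if r is not None and r != k)
    return right - wrong


def distribution(scores):
    """Return (point score, number of pupils) pairs, highest score first."""
    return sorted(Counter(scores).items(), reverse=True)


# Hypothetical responses from four pupils.
papers = [
    ["-", "+", "+", "-"],
    ["-", "-", None, "-"],
    ["+", "+", None, None],
    ["-", "+", "+", "-"],
]
scores = [point_score(p) for p in papers]
print(scores)
print(distribution(scores))
```

Subtracting the wrong answers is what makes guessing count against the pupil: on a two-choice statement a blind guess is as likely to lose a point as to gain one.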


NOTE: The “Directions to teachers” given above would not appear on the usual printed
examinations. They are placed here for the convenience of teachers.


COMPLETION EXAMINATION IN AMERICAN GOVERNMENT


Prepared by
Bureau of Educational Research
University of Illinois


Name ........................ Boy or Girl ............
Age last birthday ........ Next birthday will be ........ 19....
Grade ........ Date ........ City ........ State ........
School ........................ Teacher ........................


Below you will find a number of statements. In each statement one or more important
words have been omitted. Each blank in the sentence shows where a word
has been left out. Read each statement carefully, then write in the blank the word
which completes the meaning of the statement. You will be allowed 15 minutes for
the test.

1. The primary purpose for which government exists is the ........ of our lives
and property.

2. Citizenship may be acquired by ........ in this country or by a process of
........ for natives of other lands.

3. Our national government derives its authority from the ........ of the United
States through our national ........

4. The legislative power granted to the national government is vested in a Congress of
........ houses, the smaller of which is called the ........ and the

5. The execution of the laws made by ........ is intrusted to the ........
of the United States.

6. All judges connected with the national courts are appointed for life with the consent

7. Most of the candidates for office which are filled by popular vote are nominated
directly in ........

8. The Fifteenth Amendment of the United States Constitution prevents the states

9. Practically all of our law-making bodies are made up of ........ chosen
for short terms from ........ into which the states, counties and cities are
divided.

10. The first permanent English settlements in America were made in what is now the
state of ........

11. In a county, the records of the county board and other official papers are preserved
by the County ........

12. All cities are public corporations created under ........ municipal laws.

13. Every incorporated city obtains from the ........ government a ........
under which it may elect its officials and conduct its business.

14. Civil service employees may be removed from service only for ........

15. The power of impeaching a state officer is given to the ........

16. The ........ is by far the most prominent and powerful executive official
in the state. ........ state officers are appointed by him or are responsible
to him.

17. All important officials connected with the executive or judicial service of the
United States may be removed by ........ through the lower house of
Congress and by ........ in the senate.

18. Far more property is destroyed by ........ than by all other agencies.

19. There is no task of state and local government which outranks in importance that
of providing ........ education at public expense.

20. All rivers and canals within a single state are controlled by the ........
in which they are located.

21. Most of the revenue for state and local governments is secured by a ........
on ........

22. A state ........ is the fundamental law which the people of the state have
arranged for their government and protection.

23. A state constitution can be changed by means of an ........

24. The three-fifths compromise provided that five ........ should be counted as
equal to three ........ when reckoning the ........ for either direct
taxation or representation. 
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RECOGNITION EXAMINATION IN ALGEBRA 


Prepared by 
Bureau of Educational Research 


University of Illinois 
UNE WESTER da on A ae ah cee Pe AR Sal ea A Boyor Girleen oe ei ee ee 
Aveilas Gipikthdayincecssnte secs INextibixthdayawillibemusemrene secre means mae LOR 
Grade mee eee Datevwcs we Git Vie a hee ioe eer ae StAtehen eet ete eee ce: 
S choo] Rasa pee ene ee aoe, ee Mcacherserere eres ee ae ee ee 


Below you will find a number of statements. In each statement a word or number 
has been omitted. At the close of the statement several words or numbers have been 
given. One of these is the correct answer. Select the word or number which you think 
is correct and draw a line under it. Most, if not all, of the examples can be solved by 
mental calculation. If any figuring is necessary, work on the margin of the page. 
You will be allowed 17 minutes for the test. 


1. Numbers that are represented by letters are called... eee numbers. 


substituted—literal 


2. When two or more letters are multiplied together each is called a... eseseeesesteeeees 
of the product. factor—coefficient 
3. Ifa man rides a certain distance in 10 hours, in h hours he rides.............sesscsesesesseeess 
hid 
10h; 10 k 
4, The statement 2x + 5 = 29 is called an... identity—equation. 


5. If 16 is subtracted from three times a certain number the result is 110. The number 
ig Sate tN 3624; 31%; 42 
6. A number which is a factor of two or more numbers is called a..........:esee factor. 


common—equal 


7. If there are two equal factors of a number, either is called the.........ss.-ssssss of the 
number. square root—common factor 
8. To multiply algebraic fractions take the... of the numerators for a 


new numerator and the product of the denominators for a new denominator. 


sum—product 
9. A fraction whose numerator or denominator (or both) contains fractions is called a
................ fraction. multiple—complex
10. A ................ is a statement of a fact which is to be proved. theorem—axiom
11. The name given the + sign is ................ negative—positive


12. To find the sum of two numbers whose signs are opposite, take their ................
regarding each as positive, and prefix the sign of the larger number to the answer. 


sum—difference—product 


13. Whenever a number occurs without a sign, the ................ sign is to be
understood. ×; +; -
14. The number denoting the power of a term is called the ................
prefix—exponent 
15. If a = 2, b = -3, and c = -5, then 2abc/6c = ................ -6; -2; 30


16. In adding like terms add the coefficients for the new coefficient and ................
it by the common factor. multiply—divide 


17. An expression which contains more than one term is called a ................


monomial—polynomial 

18. If the length of a rectangle = 4 feet more than twice the width, the perimeter = 
DOLCC Comme ne Neng tht —— sterner seceees feet. 8—12—16 

19. Any term may be transposed from one side of an equation to the other, provided
its ................ is changed. sign—value

20. Any equation which contains no higher power of the unknown letter than the first 
is called a ................ equation. radical—simple

21. The exponent of the product of two powers of the same number is equal to the 
................ of the exponents of the factors. product—sum

22. To raise the product of two numbers to any power, raise the numbers separately 
to that power and take their ................ product—sum


23. The square of any two numbers is equal to the square of the first number ................


twice the product of the two plus the square of the second number. plus—minus 


24. A 20 foot ladder rests against a building, the bottom of the ladder being 12 feet 


from the cellar wall. The top is ................ feet from the ground. 8-16
25. In division, the sign of the quotient is ................ whenever the dividend and
divisor have like signs. +; -
26. In finding the quotient of two powers of the same number the exponent of the
quotient is equal to the exponent of the dividend ................ by that of the divisor.
increased—diminished

27. (3x² - 2x - 1) ÷ (x - 1) = ................ 3x + 1; 3x - 1

28. A factor which has no factor except itself and unity is called a ................ factor.
prime—multiple

29. The product of all the common prime factors of two or more numbers or
expressions is called their ................ common factor. highest—lowest

30. If one number is exactly divisible by another, the first is called a ................ of the
second. divisor—multiple

31. In algebraic fractions the dividend is called the ................
denominator—numerator